Preface to the Second Edition


The first edition of this book appeared a decade ago. This is a revised and
expanded version. My goal has remained the same: to provide a text for
a second course in matrix theory and linear algebra accessible to advanced
undergraduate and beginning graduate students. Through the course, students
learn, practice, and master basic matrix results and techniques (or
matrix kung fu) that are useful for applications in various fields such as
mathematics, statistics, physics, computer science, and engineering.

Major changes for the new edition are: eliminated errors, typos, and 
mistakes found in the first edition; expanded with topics such as matrix 
functions, nonnegative matrices, and (unitarily invariant) matrix norms; 
included more than 1000 exercise problems; rearranged some material from 
the previous version to form a new chapter, Chapter 4, which now contains 
numerical ranges and radii, matrix norms, and special operations such as 
the Kronecker and Hadamard products and compound matrices; and added 
a new chapter, Chapter 10, “Majorization and Matrix Inequalities”, which 
presents a variety of inequalities on the eigenvalues and singular values of 
matrices and unitarily invariant norms. 

I am thankful to many mathematicians who have sent me their com- 
ments on the first edition of the book or reviewed the manuscript of this 
edition: Liangjun Bai, Jane Day, Farid O. Farid, Takayuki Furuta, Geoffrey 
Goodson, Roger Horn, Zejun Huang, Minghua Lin, Dennis Merino, George 
P.H. Styan, Gotz Trenkler, Qingwen Wang, Yimin Wei, Changqing Xu, Hu 
Yang, Xingzhi Zhan, Xiaodong Zhang, and Xiuping Zhang. I also thank 
Farquhar College of Arts and Sciences at Nova Southeastern University for 
providing released time for me to work on this project. 

Readers are welcome to communicate with me via e-mail. 


Fuzhen Zhang 

Fort Lauderdale 

May 23, 2011 
zhang@nova.edu 
www.nova.edu/~zhang


Preface


It has been my goal to write a concise book that contains fundamen- 
tal ideas, results, and techniques in linear algebra and (mainly) in matrix 
theory which are accessible to general readers with an elementary linear 
algebra background. I hope this book serves the purpose. 

Having been studied for more than a century, linear algebra is of central 
importance to all fields of mathematics. Matrix theory is widely used in 
a variety of areas including applied math, computer science, economics, 
engineering, operations research, statistics, and others. 

Modern work in matrix theory is not confined to either linear or alge- 
braic techniques. The subject has a great deal of interaction with combina- 
torics, group theory, graph theory, operator theory, and other mathematical 
disciplines. Matrix theory is still one of the richest branches of mathematics; 
some intriguing problems in the field were long standing, such as the Van 
der Waerden conjecture (1926-1980), and some, such as the permanental- 
dominance conjecture (since 1966), are still open. 

This book contains eight chapters covering various topics from sim- 
ilarity and special types of matrices to Schur complements and matrix 
normality. Each chapter focuses on the results, techniques, and methods 
that are beautiful, interesting, and representative, followed by carefully se- 
lected problems. Many theorems are given different proofs. The material 
is treated primarily by matrix approaches and reflects the author’s tastes. 

The book can be used as a text or a supplement for a linear algebra 
or matrix theory class or seminar. A one-semester course may consist of 
the first four chapters plus any other chapter(s) or section(s). The only 
prerequisites are a decent background in elementary linear algebra and 
calculus (continuity, derivative, and compactness in a few places). The 
book can also serve as a reference for researchers and instructors. 

The author has benefited from numerous books and journals, including 
The American Mathematical Monthly, Linear Algebra and Its Applications, 
Linear and Multilinear Algebra, and the International Linear Algebra Soci- 
ety (ILAS) Bulletin Image. This book would not exist without the earlier 
works of a great number of authors (see the References). 

I am grateful to the following professors for many valuable suggestions 
and input and for carefully reading the manuscript so that many errors 
have been eliminated from the earlier version of the book: 

Professor R. B. Bapat (Indian Statistical Institute), 

Professor L. Elsner (University of Bielefeld), 

Professor R. A. Horn (University of Utah), 




Professor T.-G. Lei (National Natural Science Foundation of China), 
Professor J.-S. Li (University of Science and Technology of China), 
Professor R.-C. Li (University of Kentucky), 

Professor Z.-S. Li (Georgia State University), 

Professor D. Simon (Nova Southeastern University), 

Professor G. P.H. Styan (McGill University), 

Professor B.-Y. Wang (Beijing Normal University), and 

Professor X.-P. Zhang (Beijing Normal University). 


F. Zhang 

Ft. Lauderdale 
March 5, 1999 
zhang@nova.edu 
www.nova.edu/~zhang


Frequently Used Notation and Terminology


$\dim V$                dimension of vector space $V$
$M_n$                   $n \times n$ (i.e., $n$-square) matrices with complex entries
$A = (a_{ij})$          matrix $A$ with $(i, j)$-entry $a_{ij}$
$I$                     identity matrix
$A^T$                   transpose of matrix $A$
$\overline{A}$          conjugate of matrix $A$
$A^*$                   conjugate transpose of matrix $A$, i.e., $A^* = \overline{A}^T$
$A^{-1}$                inverse of matrix $A$
$\operatorname{rank}(A)$  rank of matrix $A$
$\operatorname{tr} A$   trace of matrix $A$
$\det A$                determinant of matrix $A$
$|A|$                   determinant for a block matrix $A$ or $(A^*A)^{1/2}$ or $(|a_{ij}|)$
$(u, v)$                inner product of vectors $u$ and $v$
$\| \cdot \|$           norm of a vector or a matrix
$\operatorname{Ker}(A)$ kernel or null space of $A$, i.e., $\operatorname{Ker}(A) = \{x : Ax = 0\}$
$\operatorname{Im}(A)$  image space of $A$, i.e., $\operatorname{Im}(A) = \{Ax\}$
$\rho(A)$               spectral radius of matrix $A$
$\sigma_{\max}(A)$      largest singular value (spectral norm) of matrix $A$
$\lambda_{\max}(A)$     largest eigenvalue of matrix $A$
$A \ge 0$               $A$ is positive semidefinite (or all $a_{ij} \ge 0$ in Section 5.7)
$A \ge B$               $A - B$ is positive semidefinite (or $a_{ij} \ge b_{ij}$ in Section 5.7)
$A \circ B$             Hadamard (entrywise) product of matrices $A$ and $B$
$A \otimes B$           Kronecker (tensor) product of matrices $A$ and $B$
$x \prec_w y$           weak majorization, i.e., all $\sum_{i=1}^k x_i^{\downarrow} \le \sum_{i=1}^k y_i^{\downarrow}$ hold
$x \prec_{wlog} y$      weak log-majorization, i.e., all $\prod_{i=1}^k x_i^{\downarrow} \le \prod_{i=1}^k y_i^{\downarrow}$ hold


An $n \times n$ matrix $A$ is said to be

upper-triangular        if all entries below the main diagonal are zero
diagonalizable          if $P^{-1}AP$ is diagonal for some invertible matrix $P$
similar to $B$          if $P^{-1}AP = B$ for some invertible matrix $P$
unitarily similar to $B$  if $U^*AU = B$ for some unitary matrix $U$
unitary                 if $AA^* = A^*A = I$, i.e., $A^{-1} = A^*$
positive semidefinite   if $x^*Ax \ge 0$ for all vectors $x \in \mathbb{C}^n$
Hermitian               if $A = A^*$
normal                  if $A^*A = AA^*$

$\lambda \in \mathbb{C}$ is an eigenvalue of $A \in M_n$ if $Ax = \lambda x$ for some nonzero $x \in \mathbb{C}^n$.


Frequently Used Theorems


• Cauchy–Schwarz inequality: Let $V$ be an inner product space over
a number field ($\mathbb{R}$ or $\mathbb{C}$). Then for all vectors $x$ and $y$ in $V$

$$|(x, y)|^2 \le (x, x)(y, y).$$

Equality holds if and only if $x$ and $y$ are linearly dependent.


• Theorem on the eigenvalues of $AB$ and $BA$: Let $A$ and $B$ be $m \times n$
and $n \times m$ complex matrices, respectively. Then $AB$ and $BA$ have the
same nonzero eigenvalues, counting multiplicity. As a consequence,

$$\operatorname{tr}(AB) = \operatorname{tr}(BA).$$

• Schur triangularization theorem: For any $n$-square matrix $A$, there
exists an $n$-square unitary matrix $U$ such that $U^*AU$ is upper-triangular.


• Jordan decomposition theorem: For any $n$-square matrix $A$, there
exists an $n$-square invertible complex matrix $P$ such that

$$A = P^{-1}(J_1 \oplus J_2 \oplus \cdots \oplus J_k)P,$$

where each $J_i$, $i = 1, 2, \dots, k$, is a Jordan block.


• Spectral decomposition theorem: Let $A$ be an $n$-square normal
matrix with eigenvalues $\lambda_1, \lambda_2, \dots, \lambda_n$. Then there exists an $n$-square
unitary matrix $U$ such that

$$A = U^* \operatorname{diag}(\lambda_1, \lambda_2, \dots, \lambda_n) U.$$

In particular, if $A$ is positive semidefinite, then all $\lambda_i \ge 0$; if $A$ is
Hermitian, then all $\lambda_i$ are real; and if $A$ is unitary, then all $|\lambda_i| = 1$.


• Singular value decomposition theorem: Let $A$ be an $m \times n$ complex
matrix with rank $r$. Then there exist an $m$-square unitary matrix $U$ and
an $n$-square unitary matrix $V$ such that

$$A = UDV,$$

where $D$ is the $m \times n$ matrix with $(i, i)$-entries being the singular values
of $A$, $i = 1, 2, \dots, r$, and other entries 0. If $m = n$, then $D$ is diagonal.


CHAPTER 1


Elementary Linear Algebra Review 
Introduction: We briefly review, mostly without proof, the basic 
concepts and results taught in an elementary linear algebra course. 


The subjects are vector spaces, basis and dimension, linear transfor- 
mations and their eigenvalues, and inner product spaces. 


1.1 Vector Spaces 


Let V be a set of objects (elements) and F be a field, mostly the real 
number field R or the complex number field C throughout this book. 
The set V is called a vector space over F if the operations addition 


$$u + v, \quad u, v \in V,$$
and scalar multiplication
$$cv, \quad c \in \mathbb{F}, \; v \in V,$$

are defined so that the addition is associative, is commutative, has
an additive identity 0 and additive inverse $-v$ in $V$ for each $v \in V$,
and so that the scalar multiplication is distributive, is associative,
and has an identity $1 \in \mathbb{F}$ for which $1v = v$ for every $v \in V$.
Put these in symbols:

1. $u + v \in V$ for all $u, v \in V$.
2. $cv \in V$ for all $c \in \mathbb{F}$ and $v \in V$.
3. $u + v = v + u$ for all $u, v \in V$.
4. $(u + v) + w = u + (v + w)$ for all $u, v, w \in V$.
5. There is an element $0 \in V$ such that $v + 0 = v$ for all $v \in V$.
6. For each $v \in V$ there is an element $-v \in V$ so that $v + (-v) = 0$.
7. $c(u + v) = cu + cv$ for all $c \in \mathbb{F}$ and $u, v \in V$.
8. $(a + b)v = av + bv$ for all $a, b \in \mathbb{F}$ and $v \in V$.
9. $(ab)v = a(bv)$ for all $a, b \in \mathbb{F}$ and $v \in V$.
10. $1v = v$ for all $v \in V$.


Figure 1.1: Vector addition and scalar multiplication

—) 


We call the elements of a vector space vectors and the elements
of the field scalars. For instance, $\mathbb{R}^n$, the set of real column vectors

$$\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}, \quad \text{also written as } (x_1, x_2, \dots, x_n)^T$$

($T$ for transpose), is a vector space over $\mathbb{R}$ with respect to the addition
$$(x_1, x_2, \dots, x_n)^T + (y_1, y_2, \dots, y_n)^T = (x_1 + y_1, x_2 + y_2, \dots, x_n + y_n)^T$$
and the scalar multiplication
$$c(x_1, x_2, \dots, x_n)^T = (cx_1, cx_2, \dots, cx_n)^T, \quad c \in \mathbb{R}.$$

Note that the real row vectors also form a vector space over $\mathbb{R}$;
and they are essentially the same as the column vectors as far as
vector spaces are concerned. For convenience, we may also consider
$\mathbb{R}^n$ as a row vector space if no confusion is caused. However, in the
matrix-vector product $Ax$, obviously $x$ needs to be a column vector.

Let $S$ be a nonempty subset of a vector space $V$ over a field $\mathbb{F}$.
Denote by Span $S$ the collection of all finite linear combinations of
the vectors in $S$; that is, Span $S$ consists of all vectors of the form

$$c_1v_1 + c_2v_2 + \cdots + c_tv_t, \quad t = 1, 2, \dots, \; c_i \in \mathbb{F}, \; v_i \in S.$$

The set Span $S$ is also a vector space over $\mathbb{F}$. If Span $S = V$, then
every vector in $V$ can be expressed as a linear combination of vectors
in $S$. In such cases we say that the set $S$ spans the vector space $V$.


A set $S = \{v_1, v_2, \dots, v_k\}$ is said to be linearly independent if

$$c_1v_1 + c_2v_2 + \cdots + c_kv_k = 0$$

holds only when $c_1 = c_2 = \cdots = c_k = 0$. If there are also nontrivial
solutions, i.e., not all $c_i$ are zero, then $S$ is linearly dependent.

For example, both $\{(1,0), (0,1), (1,1)\}$ and $\{(1,0), (0,1)\}$ span
$\mathbb{R}^2$. The first set is linearly dependent; the second one is linearly
independent. The vectors $(1,0)$ and $(1,1)$ also span $\mathbb{R}^2$.
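
The three example sets above can be checked mechanically. The following is a minimal sketch, not part of the original text: it assumes the vectors have rational entries and uses the fact (stated in Section 1.2) that the rank of a matrix equals the largest number of linearly independent rows. The helper names `rank`, `is_independent`, and `spans` are illustrative choices.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix (given as a list of rows) by Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0  # number of pivot rows found so far
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        # Look for a row at or below position r with a nonzero entry in this column.
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        # Clear the entries below the pivot.
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def is_independent(vectors):
    """A finite set is linearly independent iff its rank equals its size."""
    return rank(vectors) == len(vectors)


def spans(vectors, n):
    """A set of vectors in an n-dimensional coordinate space spans it iff its rank is n."""
    return rank(vectors) == n


print(is_independent([(1, 0), (0, 1), (1, 1)]))  # dependent set
print(is_independent([(1, 0), (0, 1)]))          # independent set
print(spans([(1, 0), (1, 1)], 2))                # also spans the plane
```

Exact rational arithmetic is used here so that the zero tests are reliable; with floating-point entries a tolerance would be needed instead.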

A basis of a vector space $V$ is a linearly independent set that spans
$V$. If $V$ possesses a basis of an $n$-vector set $S = \{v_1, v_2, \dots, v_n\}$, we
say that $V$ is of dimension $n$, written as $\dim V = n$. Conventionally,
if $V = \{0\}$, we write $\dim V = 0$. If no finite set can span $V$,
then $V$ is infinite-dimensional and we write $\dim V = \infty$. Unless
otherwise stated, we assume throughout the book that the vector
spaces are finite-dimensional, as we mostly deal with finite matrices,
even though some results hold for infinite-dimensional spaces.

For instance, $\mathbb{C}$ is a vector space of dimension 2 over $\mathbb{R}$ with basis
$\{1, i\}$, where $i = \sqrt{-1}$, and of dimension 1 over $\mathbb{C}$ with basis $\{1\}$.

$\mathbb{C}^n$, the set of row (or column) vectors of $n$ complex components,
is a vector space over $\mathbb{C}$ having standard basis

$$e_1 = (1, 0, \dots, 0, 0), \quad e_2 = (0, 1, \dots, 0, 0), \quad \dots, \quad e_n = (0, 0, \dots, 0, 1).$$

If $\{u_1, u_2, \dots, u_n\}$ is a basis for a vector space $V$ of dimension $n$,
then every $x$ in $V$ can be uniquely expressed as a linear combination
of the basis vectors:

$$x = x_1u_1 + x_2u_2 + \cdots + x_nu_n,$$

where the $x_i$ are scalars. The $n$-tuple $(x_1, x_2, \dots, x_n)$ is called the
coordinate of vector $x$ with respect to the basis.
Let $V$ be a vector space of dimension $n$, and let $\{v_1, v_2, \dots, v_k\}$ be
a linearly independent subset of $V$. Then $k \le n$, and it is not difficult
to see that if $k < n$, then there exists a vector $v_{k+1} \in V$ such that
the set $\{v_1, v_2, \dots, v_k, v_{k+1}\}$ is linearly independent (Problem 16). It
follows that the set $\{v_1, v_2, \dots, v_k\}$ can be extended to a basis of $V$.
Let $W$ be a subset of a vector space $V$. If $W$ is also a vector space
under the addition and scalar multiplication for $V$, then $W$ is called
a subspace of $V$. One may check (Problem 9) that $W$ is a subspace if
and only if $W$ is closed under the addition and scalar multiplication.
For subspaces $V_1$ and $V_2$, the sum of $V_1$ and $V_2$ is defined to be

$$V_1 + V_2 = \{v_1 + v_2 : v_1 \in V_1, \; v_2 \in V_2\}.$$

It follows that the sum $V_1 + V_2$ is also a subspace. In addition,
the intersection $V_1 \cap V_2$ is a subspace, and

$$V_1 \cap V_2 \subseteq V_i \subseteq V_1 + V_2, \quad i = 1, 2.$$

The sum $V_1 + V_2$ is called a direct sum, symbolized by $V_1 \oplus V_2$, if

$$v_1 + v_2 = 0, \; v_1 \in V_1, \; v_2 \in V_2 \quad \Rightarrow \quad v_1 = v_2 = 0.$$

One checks that in the case of a direct sum, every vector in $V_1 \oplus V_2$
is uniquely written as a sum of a vector in $V_1$ and a vector in $V_2$.


Figure 1.2: Direct sum

Theorem 1.1 (Dimension Identity) Let $V$ be a finite-dimensional
vector space, and let $V_1$ and $V_2$ be subspaces of $V$. Then

$$\dim V_1 + \dim V_2 = \dim(V_1 + V_2) + \dim(V_1 \cap V_2).$$

The proof is done by first choosing a basis $\{u_1, \dots, u_k\}$ for $V_1 \cap V_2$,
extending it to a basis $\{u_1, \dots, u_k, v_{k+1}, \dots, v_s\}$ for $V_1$ and a basis
$\{u_1, \dots, u_k, w_{k+1}, \dots, w_t\}$ for $V_2$, and then showing that

$$\{u_1, \dots, u_k, v_{k+1}, \dots, v_s, w_{k+1}, \dots, w_t\}$$

is a basis for $V_1 + V_2$.
It follows that subspaces $V_1$ and $V_2$ contain nonzero common
vectors if the sum of their dimensions exceeds $\dim V$.


Problems 


1. Show explicitly that $\mathbb{R}^2$ is a vector space over $\mathbb{R}$. Consider $\mathbb{R}^2$ over
$\mathbb{C}$ with the usual addition. Define $c(x, y) = (cx, cy)$, $c \in \mathbb{C}$. Is $\mathbb{R}^2$ a
vector space over $\mathbb{C}$? What if the "scalar multiplication" is defined as

$$c(x, y) = (ax + by, ax - by), \quad \text{where } c = a + bi, \; a, b \in \mathbb{R}?$$

2. Can a vector space have two different additive identities? Why?

3. Show that $\mathbb{F}_n[x]$, the collection of polynomials over a field $\mathbb{F}$ with
degree at most $n$, is a vector space over $\mathbb{F}$ with respect to the ordinary
addition and scalar multiplication of polynomials. Is $\mathbb{F}[x]$, the set of
polynomials with any finite degree, a vector space over $\mathbb{F}$? What is
the dimension of $\mathbb{F}_n[x]$ or $\mathbb{F}[x]$?

4. Determine whether the vectors $v_1 = 1 + x - 2x^2$, $v_2 = 2 + 5x - x^2$,
and $v_3 = x + x^2$ in $\mathbb{F}_2[x]$ are linearly independent.

5. Show that $\{(1, i), (i, -1)\}$ is a linearly independent subset of $\mathbb{C}^2$ over
the real $\mathbb{R}$ but not over the complex $\mathbb{C}$.

6. Determine whether $\mathbb{R}^2$, with the operations

$$(x_1, y_1) + (x_2, y_2) = (x_1x_2, y_1y_2)$$

and

$$c(x_1, y_1) = (cx_1, cy_1),$$

is a vector space over $\mathbb{R}$.


7. Let $V$ be the set of all real numbers in the form

$$a + b\sqrt{2} + c\sqrt{5},$$

where $a$, $b$, and $c$ are rational numbers. Show that $V$ is a vector space
over the rational number field $\mathbb{Q}$. Find $\dim V$ and a basis of $V$.

8. Let $V$ be a vector space. If $u, v, w \in V$ are such that $au + bv + cw = 0$
for some scalars $a, b, c$, $ac \ne 0$, show that Span$\{u, v\}$ = Span$\{v, w\}$.

9. Let $V$ be a vector space over $\mathbb{F}$ and let $W$ be a subset of $V$. Show
that $W$ is a subspace of $V$ if and only if for all $u, v \in W$ and $c \in \mathbb{F}$

$$u + v \in W \quad \text{and} \quad cu \in W.$$

10. Is the set $\{(x, y) \in \mathbb{R}^2 : 2x - 3y = 0\}$ a subspace of $\mathbb{R}^2$? How about
$\{(x, y) \in \mathbb{R}^2 : 2x - 3y = 1\}$? Give a geometric explanation.

11. Show that the set $\{(x, y - x, y) : x, y \in \mathbb{R}\}$ is a subspace of $\mathbb{R}^3$. Find
the dimension and a basis of the subspace.

12. Find a basis for Span$\{u, v, w\}$, where $u = (1, 1, 0)$, $v = (1, 3, -1)$,
and $w = (1, -1, 1)$. Find the coordinate of $(1, 2, 3)$ under the basis.

13. Let $W = \{(x_1, x_2, x_3, x_4) \in \mathbb{R}^4 : x_3 = x_1 + x_2 \text{ and } x_4 = x_1 - x_2\}$.
(a) Prove that $W$ is a subspace of $\mathbb{R}^4$.
(b) Find a basis for $W$. What is the dimension of $W$?
(c) Prove that $\{c(1, 0, 1, 1) : c \in \mathbb{R}\}$ is a subspace of $W$.
(d) Is $\{c(1, 0, 0, 0) : c \in \mathbb{R}\}$ a subspace of $W$?

14. Show that each of the following is a vector space over $\mathbb{R}$.
(a) $C[a, b]$, the set of all (real-valued) continuous functions on $[a, b]$.
(b) $C'(\mathbb{R})$, the set of all functions of continuous derivatives on $\mathbb{R}$.
(c) The set of all even functions.
(d) The set of all odd functions.
(e) The set of all functions $f$ that satisfy $f(0) = 0$.

[Note: Unless otherwise stated, functions are added and multiplied by
scalars in a usual way, i.e., $(f + g)(x) = f(x) + g(x)$, $(kf)(x) = kf(x)$.]

15. Show that if $W$ is a subspace of vector space $V$ of dimension $n$, then
$\dim W \le n$. Is it possible that $\dim W = n$ for a proper subspace $W$?

16. Let $\{u_1, \dots, u_s\}$ and $\{v_1, \dots, v_t\}$ be two sets of vectors. If $s > t$ and
each $u_i$ can be expressed as a linear combination of $v_1, \dots, v_t$, show
that $u_1, \dots, u_s$ are linearly dependent.

17. Let $V$ be a vector space over a field $\mathbb{F}$. Show that $cv = 0$, where $c \in \mathbb{F}$
and $v \in V$, if and only if $c = 0$ or $v = 0$. [Note: The scalar 0 and the
vector 0 are usually different. For simplicity, here we use 0 for both.
In general, one can easily tell from the text which is which.]

18. Let $V_1$ and $V_2$ be subspaces of a finite-dimensional space. Show that
the sum $V_1 + V_2$ is a direct sum if and only if

$$\dim(V_1 + V_2) = \dim V_1 + \dim V_2.$$

Conclude that if $\{u_1, \dots, u_s\}$ is a basis for $V_1$ and $\{v_1, \dots, v_t\}$ is a
basis for $V_2$, then $\{u_1, \dots, u_s, v_1, \dots, v_t\}$ is a basis for $V_1 \oplus V_2$.

19. If $V_1$, $V_2$, and $W$ are subspaces of a finite-dimensional vector space
$V$ such that $V_1 \oplus W = V_2 \oplus W$, is it always true that $V_1 = V_2$?

20. Let $V$ be a vector space of finite dimension over a field $\mathbb{F}$. If $V_1$ and
$V_2$ are two subspaces of $V$ such that $\dim V_1 = \dim V_2$, show that
there exists a subspace $W$ such that $V = V_1 \oplus W = V_2 \oplus W$.

21. Let $V_1$ and $V_2$ be subspaces of a vector space of finite dimension such
that $\dim(V_1 + V_2) = \dim(V_1 \cap V_2) + 1$. Show that $V_1 \subseteq V_2$ or $V_2 \subseteq V_1$.

22. Let $S_1$, $S_2$, and $S_3$ be subspaces of a vector space of dimension $n$.
Show that

$$(S_1 + S_2) \cap (S_1 + S_3) = S_1 + (S_1 + S_2) \cap S_3.$$

23. Let $S_1$, $S_2$, and $S_3$ be subspaces of a vector space of dimension $n$.
Show that

$$\dim(S_1 \cap S_2 \cap S_3) \ge \dim S_1 + \dim S_2 + \dim S_3 - 2n.$$


1.2 Matrices and Determinants


An m x n matrix A over a field F is a rectangular array of m rows 
and n columns of entries in F: 


$$A = \begin{pmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \dots & a_{mn} \end{pmatrix}.$$

Such a matrix, written as $A = (a_{ij})$, is said to be of size (or order)
$m \times n$. Two matrices are considered to be equal if they have the
same size and same corresponding entries in all positions.

The set of all $m \times n$ matrices over a field $\mathbb{F}$ is a vector space with
respect to matrix addition by adding corresponding entries and to
scalar multiplication by multiplying each entry of the matrix by the
scalar. The dimension of the space is $mn$, and the matrices with one
entry equal to 1 and 0 entries elsewhere form a basis. In the case of
square matrices, that is, $m = n$, the dimension is $n^2$.

We denote by $M_{m \times n}(\mathbb{F})$ the set of all $m \times n$ matrices over the
field $\mathbb{F}$, and throughout the book we simply write $M_n$ for the set of
all complex $n$-square (i.e., $n \times n$) matrices.

The product $AB$ of two matrices $A = (a_{ij})$ and $B = (b_{ij})$ is
defined to be the matrix whose $(i, j)$-entry is given by

$$a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{in}b_{nj}.$$


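Thus, in order that $AB$ make sense, the number of columns of $A$
must be equal to the number of rows of $B$. Take, for example,

$$A = \begin{pmatrix} 1 & -1 \\ 0 & 2 \end{pmatrix}, \quad B = \begin{pmatrix} 3 & 4 & 5 \\ 6 & 0 & 8 \end{pmatrix}.$$

Then

$$AB = \begin{pmatrix} -3 & 4 & -3 \\ 12 & 0 & 16 \end{pmatrix}.$$

Note that $BA$ is undefined.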
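
The entry formula for the product translates directly into code. The sketch below is illustrative only: the function name `matmul` and the two small matrices are hypothetical choices, not taken from the text. It also shows the size requirement, since the product is rejected when the number of columns of the first factor differs from the number of rows of the second.

```python
def matmul(A, B):
    """Product of matrices given as lists of rows.

    The (i, j)-entry is a_i1*b_1j + a_i2*b_2j + ... + a_in*b_nj.
    """
    if len(A[0]) != len(B):
        raise ValueError("number of columns of A must equal number of rows of B")
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


# Hypothetical example: A is 2 x 2 and B is 2 x 3, so AB is 2 x 3.
A = [[1, 2],
     [3, 4]]
B = [[1, 0, 2],
     [0, 1, 3]]

print(matmul(A, B))

# BA is undefined here: B has 3 columns but A has only 2 rows.
try:
    matmul(B, A)
except ValueError as err:
    print("BA is undefined:", err)
```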
Sometimes it is useful to write the matrix product $AB$, with
$B = (b_1, b_2, \dots, b_n)$, where the $b_i$ are the column vectors of $B$, as

$$AB = (Ab_1, Ab_2, \dots, Ab_n).$$

The transpose of an $m \times n$ matrix $A = (a_{ij})$ is an $n \times m$ matrix,
denoted by $A^T$, whose $(i, j)$-entry is $a_{ji}$; and the conjugate of $A$ is a
matrix of the same size as $A$, symbolized by $\overline{A}$, whose $(i, j)$-entry is
$\overline{a_{ij}}$. We denote the conjugate transpose $\overline{A}^T$ of $A$ by $A^*$.

The $n \times n$ identity matrix $I_n$, or simply $I$, is the $n$-square matrix
with all diagonal entries 1 and off-diagonal entries 0. A scalar matrix
is a multiple of $I$, and a zero matrix 0 is a matrix with all entries 0.
Note that two zero matrices may not be the same, as they may have
different sizes. A square complex matrix $A = (a_{ij})$ is said to be

diagonal            if $a_{ij} = 0$, $i \ne j$,
upper-triangular    if $a_{ij} = 0$, $i > j$,
symmetric           if $A^T = A$,
Hermitian           if $A^* = A$,
normal              if $A^*A = AA^*$,
unitary             if $A^*A = AA^* = I$, and
orthogonal          if $A^TA = AA^T = I$.


A submatrix of a given matrix is an array lying in specified subsets
of the rows and columns of the given matrix. For example, 


1 2 
o=(31) 
37 
is a submatrix of 
0 1 2 
A=[{[i 3 j 


lying in rows one and two and columns two and three. 


If we write B = (0,1), D = (m), and E = (V3, —1), then 


$$A = \begin{pmatrix} B & C \\ D & E \end{pmatrix}.$$

The right-hand side matrix is called a partitioned or block form of $A$,
or we say that $A$ is a partitioned (or block) matrix.

The manipulation of partitioned matrices is a basic technique 
in matrix theory. One can perform addition and multiplication of 
(appropriately) partitioned matrices as with ordinary matrices. 

For instance, if $A$, $B$, $C$, $X$, $Y$, $U$, $V$ are $n$-square matrices, then

$$\begin{pmatrix} A & B \\ 0 & C \end{pmatrix} \begin{pmatrix} X & Y \\ U & V \end{pmatrix} = \begin{pmatrix} AX + BU & AY + BV \\ CU & CV \end{pmatrix}.$$

The block matrices of order $2 \times 2$ have proved to be the most
useful partitioned matrices. We primarily emphasize the techniques
for block matrices of this kind in this book.
Elementary row operations for matrices are those that 


i. Interchange two rows. 
ii. Multiply a row by a nonzero constant. 


iii. Add a multiple of a row to another row. 


Elementary column operations are similarly defined, and similar 
operations on partitioned matrices are discussed in Section 2.1. 

An $n$-square matrix is called an elementary matrix if it can be
obtained from $I_n$ by a single elementary row operation. Elementary
operations can be represented by elementary matrices. Let $E$ be
the elementary matrix obtained by performing an elementary row (or column)
operation on $I_m$ (or $I_n$ for column). If the same elementary row
(or column) operation is performed on an $m \times n$ matrix $A$, then
the resulting matrix from $A$ via the elementary row (or column)
operation is given by the product $EA$ (or $AE$, respectively).
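
The statement that elementary operations are realized by multiplication with elementary matrices can be checked on a small case. The following sketch uses hypothetical helper names (`identity`, `matmul`, `add_row_multiple`, `swap_columns`) and an illustrative $2 \times 3$ matrix; it is one possible way to set up the check, not a construction from the text.

```python
def identity(n):
    """The n x n identity matrix as a list of rows."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def matmul(A, B):
    """Matrix product of A (m x n) and B (n x p)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]


def add_row_multiple(M, src, dst, c):
    """Row operation: return a copy of M with c times row src added to row dst."""
    out = [row[:] for row in M]
    out[dst] = [d + c * s for d, s in zip(out[dst], out[src])]
    return out


def swap_columns(M, i, j):
    """Column operation: return a copy of M with columns i and j interchanged."""
    out = [row[:] for row in M]
    for row in out:
        row[i], row[j] = row[j], row[i]
    return out


A = [[1, 2, 3],
     [4, 5, 6]]

# Row operation on A equals left multiplication by E, obtained from I_2.
E = add_row_multiple(identity(2), 0, 1, -4)
print(matmul(E, A) == add_row_multiple(A, 0, 1, -4))

# Column operation on A equals right multiplication by F, obtained from I_3.
F = swap_columns(identity(3), 0, 2)
print(matmul(A, F) == swap_columns(A, 0, 2))
```

Note the sizes: the row operation is applied to $I_2$ because $A$ has two rows, and the column operation to $I_3$ because $A$ has three columns.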

For instance, by elementary row and column operations, the 2 x 3 


matrix A = er 4 is brought into ie ~ . Write in equations: 


2 3 1 0 O 
Rafaks ( 5 5 OG =( 5 l ee 


where
and


This generalizes to the so-called rank decomposition as follows. 


Theorem 1.2 Let $A$ be an $m \times n$ matrix over a field $\mathbb{F}$. Then there
exist an $m \times m$ matrix $P$ and an $n \times n$ matrix $Q$, both of which are
products of elementary matrices with entries from $\mathbb{F}$, such that

$$PAQ = \begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix}. \tag{1.1}$$

The partitioned matrix in (1.1), written as $I_r \oplus 0$ and called a
direct sum of $I_r$ and 0, is uniquely determined by $A$. The size $r$ of
the identity $I_r$ is the rank of $A$, denoted by rank $(A)$. If $A = 0$, then
rank $(A) = 0$. Clearly rank $(A)$ = rank $(A^T)$ = rank $(A^*)$ = rank $(\overline{A})$.

An application of this theorem reveals the dimension of the solu- 
tion space or null space of the linear equation system Ax = 0. 


Theorem 1.3 Let $A$ be an $m \times n$ (real or complex) matrix of rank $r$.
Let Ker $A$ be the null space of $A$, i.e., Ker $A = \{x : Ax = 0\}$. Then

$$\dim \operatorname{Ker} A = n - r.$$

A notable fact about a linear equation system is that

$$Ax = 0 \quad \text{if and only if} \quad (A^*A)x = 0.$$
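
Theorem 1.3 and the fact that follows it can be illustrated numerically. The sketch below is an assumption-laden example rather than the book's method: it works over the rationals, takes a real matrix so that $A^* = A^T$, and uses hypothetical helper names (`rref`, `null_space`, `transpose`, `matmul`). The null space basis is read off from the reduced row echelon form, with one basis vector per non-pivot column.

```python
from fractions import Fraction


def rref(rows):
    """Reduced row echelon form and the list of pivot columns."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots = []
    r = 0
    for col in range(len(m[0])):
        if r == len(m):
            break
        p = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        piv = m[r][col]
        m[r] = [x / piv for x in m[r]]
        # Clear every other entry in the pivot column.
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                f = m[i][col]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(col)
        r += 1
    return m, pivots


def null_space(rows):
    """A basis of {x : Ax = 0}, one vector for each free (non-pivot) column."""
    m, pivots = rref(rows)
    n = len(rows[0])
    basis = []
    for free in (c for c in range(n) if c not in pivots):
        x = [Fraction(0)] * n
        x[free] = Fraction(1)
        for row_idx, pc in enumerate(pivots):
            x[pc] = -m[row_idx][free]
        basis.append(x)
    return basis


def transpose(A):
    return [list(col) for col in zip(*A)]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]


A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]

_, pivots = rref(A)
r, n = len(pivots), len(A[0])
kernel = null_space(A)
print(len(kernel) == n - r)  # dim Ker A = n - r

# For a real matrix A* is the transpose; Ax = 0 and (A*A)x = 0 have the same solutions.
AtA = matmul(transpose(A), A)
print(null_space(AtA) == kernel)
```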


The determinant of a square matrix A, denoted by det A, or |A| 
as preferred if A is in a partitioned form, is a number associated with 
A. It can be defined in several different but equivalent ways. The 
one in terms of permutations is concise and sometimes convenient. 

We say a permutation $p$ on $\{1, 2, \dots, n\}$ is even if $p$ can be restored
to natural order by an even number of interchanges. Otherwise, $p$
is odd. For instance, consider the permutations on $\{1, 2, 3, 4\}$. The
permutation $p = (2, 1, 4, 3)$; that is, $p(1) = 2$, $p(2) = 1$, $p(3) = 4$,
$p(4) = 3$, is even because it will become $(1, 2, 3, 4)$ after interchanging
2 and 1 and 4 and 3 (two interchanges), whereas $(1, 4, 3, 2)$ is odd,
for interchanging 4 and 2 gives $(1, 2, 3, 4)$.
Let $S_n$ be the set of all ($n!$) permutations on $\{1, 2, \dots, n\}$. For
$p \in S_n$, define sign$(p) = 1$ if $p$ is even and sign$(p) = -1$ if $p$ is odd.
Then the determinant of an $n$-square matrix $A = (a_{ij})$ is given by

$$\det A = \sum_{p \in S_n} \operatorname{sign}(p) \prod_{t=1}^{n} a_{t\,p(t)}.$$


Simply put, the determinant is the sum of all ($n!$) possible "signed"
products in which each product involves $n$ entries of $A$ belonging to
different rows and columns. For $n = 2$, $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$, $\det A = ad - bc$.
The determinant can be calculated by the Laplace formula

$$\det A = \sum_{j=1}^{n} (-1)^{1+j} a_{1j} \det A(1|j),$$

where $A(1|j)$ is a submatrix of $A$ obtained by deleting row 1 and
column $j$ of $A$. This formula is referred to as the Laplace expansion
formula along row 1. One can also expand a determinant along other
rows or columns to get the same result. The determinant of a matrix
has the following properties.


a. The determinant changes sign if two rows are interchanged. 


b. The determinant is unchanged if a constant multiple of one row 
is added to another row. 


c. The determinant is a linear function of any row when all the 
other rows are held fixed. 


Similar properties are true for columns. Two often-used facts are

$$\det(AB) = \det A \det B, \quad A, B \in M_n,$$

and

$$\begin{vmatrix} A & B \\ 0 & C \end{vmatrix} = \det A \det C, \quad A \in M_n, \; C \in M_m.$$
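
The permutation definition, the Laplace expansion along row 1, and the product rule can be compared on small integer matrices. The following is a minimal sketch with hypothetical function names (`sign`, `det_perm`, `det_laplace`, `matmul`) and illustrative matrices. It computes the sign of a permutation by counting inversions, whose parity agrees with the parity of the number of interchanges used in the definition above; indices are 0-based, so $(-1)^{1+j}$ becomes `(-1) ** j`.

```python
from itertools import permutations


def sign(p):
    """+1 for an even permutation, -1 for an odd one (parity of the inversion count)."""
    inversions = sum(
        1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j]
    )
    return -1 if inversions % 2 else 1


def det_perm(A):
    """Determinant as the signed sum over all n! permutations."""
    n = len(A)
    total = 0
    for p in permutations(range(n)):
        product = 1
        for t in range(n):
            product *= A[t][p[t]]
        total += sign(p) * product
    return total


def det_laplace(A):
    """Determinant by Laplace expansion along the first row."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for j in range(n):
        # Delete the first row and column j to form the submatrix A(1|j).
        minor = [row[:j] + row[j + 1:] for row in A[1:]]
        total += (-1) ** j * A[0][j] * det_laplace(minor)
    return total


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]


B = [[2, 0, 1],
     [1, 3, 2],
     [1, 1, 4]]
C = [[1, 2, 0],
     [0, 1, 0],
     [3, 0, 1]]

print(det_perm(B), det_laplace(B))
print(det_perm(matmul(B, C)) == det_perm(B) * det_perm(C))
```

The permutation sum has $n!$ terms, so it is only practical for very small $n$; it is shown here because it mirrors the definition.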


A square matrix $A$ is said to be invertible or nonsingular if there
exists a matrix $B$ of the same size such that

$$AB = BA = I.$$

Such a matrix $B$, which can be proven to be unique, is called the
inverse of $A$ and denoted by $A^{-1}$. The inverse of $A$, when it exists,
can be obtained from the adjoint of $A$, written as adj$(A)$, whose $(i, j)$-entry
is the cofactor of $a_{ji}$, that is, $(-1)^{j+i} \det A(j|i)$. In symbols,

$$A^{-1} = \frac{1}{\det A} \operatorname{adj}(A). \tag{1.2}$$


An effective way to find the inverse of a matrix $A$ is to apply
elementary row operations to the matrix $(A, I)$ to get a matrix in
the form $(I, B)$. Then $B = A^{-1}$ (Problem 23).
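
The row-reduction procedure just described can be sketched as follows. This is an illustrative implementation under stated assumptions: entries are rational (handled exactly with `Fraction`), the function name `inverse` is a hypothetical choice, and a `ValueError` is raised when no pivot can be found, which is how a singular matrix shows up in this procedure.

```python
from fractions import Fraction


def inverse(A):
    """Invert a square matrix by row reducing the augmented matrix (A, I) to (I, B)."""
    n = len(A)
    # Build (A, I) with exact rational entries.
    M = [
        [Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
        for i, row in enumerate(A)
    ]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((i for i in range(col, n) if M[i][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]          # interchange two rows
        piv = M[col][col]
        M[col] = [x / piv for x in M[col]]           # scale the pivot row
        for i in range(n):
            if i != col and M[i][col] != 0:
                f = M[i][col]
                # add a multiple of the pivot row to row i
                M[i] = [a - f * b for a, b in zip(M[i], M[col])]
    # The right-hand block now holds the inverse.
    return [row[n:] for row in M]


print(inverse([[1, 2], [3, 4]]))
```

Only the three elementary row operations are used, so the computation matches the description in the text step by step.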

If $A$ is a square matrix, then $AB = I$ if and only if $BA = I$.

It is easy to see that rank $(A)$ = rank $(PAQ)$ for invertible matrices
$P$ and $Q$ of appropriate sizes (meaning that the involved operations
for matrices can be performed). It can also be shown that
the rank of $A$ is the largest number of linearly independent columns
(rows) of $A$. In addition, the rank of $A$ is $r$ if and only if there exists
an $r$-square submatrix of $A$ with nonzero determinant, but all
$(r + 1)$-square submatrices of $A$ have determinant zero (Problem 24).


Theorem 1.4 The following statements are equivalent for $A \in M_n$.

1. $A$ is invertible, i.e., $AB = BA = I$ for some $B \in M_n$.
2. $AB = I$ (or $BA = I$) for some $B \in M_n$.
3. $A$ is of rank $n$.
4. $A$ is a product of elementary matrices.
5. $Ax = 0$ has only the trivial solution $x = 0$.
6. $Ax = b$ has a unique solution for each $b \in \mathbb{C}^n$.
7. $\det A \ne 0$.
8. The column vectors of $A$ are linearly independent.
9. The row vectors of $A$ are linearly independent.


Problems


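1. Find the rank of $\begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 2 & 1 & 0 \end{pmatrix}$ by performing elementary operations.

2. Evaluate the determinants

$$\begin{vmatrix} 2 & -3 & 10 \\ 1 & 2 & -2 \\ 0 & 1 & -8 \end{vmatrix}, \quad \begin{vmatrix} 1+x & 1 & 1 \\ 1 & 1+y & 1 \\ 1 & 1 & 1+z \end{vmatrix}.$$

3. Show the $3 \times 3$ Vandermonde determinant identity

$$\begin{vmatrix} 1 & 1 & 1 \\ a_1 & a_2 & a_3 \\ a_1^2 & a_2^2 & a_3^2 \end{vmatrix} = (a_2 - a_1)(a_3 - a_1)(a_3 - a_2)$$

and evaluate the determinant

$$\begin{vmatrix} 1 & a & a^2 - bc \\ 1 & b & b^2 - ca \\ 1 & c & c^2 - ab \end{vmatrix}.$$

4. Let $A$ be an $n$-square matrix and $k$ be a scalar. Show that

$$\det(kA) = k^n \det A.$$

5. If $A$ is a Hermitian (complex) matrix, show that $\det A$ is real.
6. If $A$ is an $n \times n$ real matrix, where $n$ is odd, show that $A^2 \ne -I$.
7. Let $A \in M_n$. Show that $A^* + A$ is Hermitian and $A^* - A$ is normal.
8. Let $A$ and $B$ be complex matrices of appropriate sizes. Show that
(a) $\overline{AB} = \overline{A}\,\overline{B}$,
(b) $(AB)^T = B^TA^T$,
(c) $(AB)^* = B^*A^*$, and
(d) $(AB)^{-1} = B^{-1}A^{-1}$ if $A$ and $B$ are invertible.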


9. Show that matrices (. a and ( *) are both symmetric, but one is 
normal and the other one is not normal. 


10. Find the inverse of each of the following matrices. 


1 a 0 i it 0 110 
010], |o11), 111 
0b 1 Dy 0. a 0 11 
11. If $a$, $b$, $c$, and $d$ are complex numbers such that $ad - bc \ne 0$, show that

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix}^{-1} = \frac{1}{ad - bc} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}.$$


12. Compute for every positive integer k, 


Gee arr ceEe 


13. Show that for any square matrices $A$ and $B$ of the same size,

$$A^*A - B^*B = \frac{1}{2}\big((A + B)^*(A - B) + (A - B)^*(A + B)\big).$$

14. If $AB = A + B$ for matrices $A$, $B$, show that $A$ and $B$ commute, i.e.,

$$AB = A + B \quad \Rightarrow \quad AB = BA.$$

15. Let $A$ and $B$ be $n$-square matrices such that $AB = BA$. Show that

$$(A + B)^k = A^k + kA^{k-1}B + \frac{k(k-1)}{2}A^{k-2}B^2 + \cdots + B^k.$$

16. Let $A$ be a square complex matrix. Show that

$$I - A^{m+1} = (I - A)(I + A + A^2 + \cdots + A^m).$$


17. Let A, B, C, and D be n-square complex matrices. Compute 


ADEN” gg (AB Db «BR 
AX A ia C D aft gh ‘ys 


18. Determine whether each of the following statements is true.

(a) The sum of Hermitian matrices is Hermitian.
(b) The product of Hermitian matrices is Hermitian.
(c) The sum of unitary matrices is unitary.
(d) The product of unitary matrices is unitary.
(e) The sum of normal matrices is normal.
(f) The product of normal matrices is normal.


19. Show that the solution set to the linear system $Ax = 0$ is a vector
space of dimension $n - \operatorname{rank}(A)$ for any $m \times n$ matrix $A$ over $\mathbb{R}$ or $\mathbb{C}$.

20. Let $A, B \in M_n$. If $AB = 0$, show that rank $(A)$ + rank $(B) \le n$.

21. Let $A$ and $B$ be complex matrices with the same number of columns.
If $Bx = 0$ whenever $Ax = 0$, show that

$$\operatorname{rank}(B) \le \operatorname{rank}(A), \quad \operatorname{rank}\begin{pmatrix} A \\ B \end{pmatrix} = \operatorname{rank}(A),$$

and that $B = CA$ for some matrix $C$. When is $C$ invertible?

22. Show that any two of the following three properties imply the third:
(a) $A = A^*$; (b) $A^* = A^{-1}$; (c) $A^2 = I$.

23. Let $A, B \in M_n$. If $B(A, I) = (I, B)$, show that $B = A^{-1}$. Explain
why $A^{-1}$, if it exists, can be obtained by row operations; that is, if
$(A, I)$ row reduces to $(I, B)$,
then matrix $B$ is the inverse of $A$. Use this approach to find the inverse of
a ee 
3.9 4 
1 5 8 
24. Show that the following statements are equivalent for $A \in M_n$.

(a) $PAQ = \begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix}$ for some invertible matrices $P$ and $Q$.

(b) The largest number of column (row) vectors of $A$ that constitute
a linearly independent set is $r$.

(c) $A$ contains an $r \times r$ nonsingular submatrix, and every $(r + 1)$-square
submatrix has determinant zero.

[Hint: View $P$ and $Q$ as sequences of elementary operations. Note
that rank does not change under elementary operations.]

25. Prove Theorem 1.4.

26. Let $A$ and $B$ be $n \times n$ matrices. Show that for any $n \times n$ matrix $X$,

$$\operatorname{rank}\begin{pmatrix} A & X \\ 0 & B \end{pmatrix} \ge \operatorname{rank}(A) + \operatorname{rank}(B).$$

Discuss the cases where $X = 0$ and $X = I$, respectively.


1.3 Linear Transformations and Eigenvalues


Let $V$ and $W$ be vector spaces over a field $F$. A map $\mathcal{A} : V \to W$ is
called a linear transformation from $V$ to $W$ if for all $u, v \in V$ and $c \in F$,
\[ \mathcal{A}(u+v) = \mathcal{A}(u) + \mathcal{A}(v) \]
and
\[ \mathcal{A}(cv) = c\,\mathcal{A}(v). \]
It is easy to check that $\mathcal{A} : \mathbb{R}^2 \to \mathbb{R}^2$, defined by
\[ \mathcal{A}(x_1, x_2) = (x_1 + x_2,\; x_1 - x_2), \]
is a linear transformation and that the differential operator $D_x$ from
$C'[a,b]$, the set (space) of functions with continuous derivatives on
the interval $[a,b]$, to $C[a,b]$, the set of continuous functions on $[a,b]$,
defined by
\[ D_x(f) = \frac{df(x)}{dx}, \qquad f \in C'[a,b], \]
is a linear transformation.
Let $\mathcal{A}$ be a linear transformation from $V$ to $W$. The subset in $W$,
\[ \mathrm{Im}(\mathcal{A}) = \{\mathcal{A}(v) : v \in V\}, \]
is a subspace of $W$, called the image of $\mathcal{A}$, and the subset in $V$,
\[ \mathrm{Ker}(\mathcal{A}) = \{v \in V : \mathcal{A}(v) = 0 \in W\}, \]
is a subspace of $V$, called the kernel or null space of $\mathcal{A}$.

Figure 1.3: Image and kernel

Theorem 1.5 Let $\mathcal{A}$ be a linear transformation from a vector space
$V$ of dimension $n$ to a vector space $W$. Then
\[ \dim \mathrm{Im}(\mathcal{A}) + \dim \mathrm{Ker}(\mathcal{A}) = n. \]

This is seen by taking a basis $\{u_1, \dots, u_s\}$ for $\mathrm{Ker}(\mathcal{A})$ and
extending it to a basis $\{u_1, \dots, u_s, v_1, \dots, v_t\}$ for $V$, where $s + t = n$.
It is easy to show that $\{\mathcal{A}(v_1), \dots, \mathcal{A}(v_t)\}$ is a basis of $\mathrm{Im}(\mathcal{A})$.
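
The dimension count in Theorem 1.5 can be checked on a small example. The sketch below is only an illustration: the matrix and the helper names `rref` and `kernel_basis` are assumptions made here, not taken from the text. It row reduces a matrix with exact fractions, reads the rank off the pivot columns, and builds one kernel vector per free column.

```python
from fractions import Fraction

def rref(rows):
    """Return (reduced row echelon form, pivot columns), using exact arithmetic."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots = []
    r = 0
    for c in range(len(m[0])):
        # look for a row at or below r with a nonzero entry in column c
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        piv = m[r][c]
        m[r] = [x / piv for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return m, pivots

def kernel_basis(rows):
    """Build one basis vector of Ker(A) for each free (non-pivot) column."""
    m, pivots = rref(rows)
    n = len(rows[0])
    basis = []
    for free in (c for c in range(n) if c not in pivots):
        v = [Fraction(0)] * n
        v[free] = Fraction(1)
        for row_idx, pc in enumerate(pivots):
            v[pc] = -m[row_idx][free]
        basis.append(v)
    return basis

# Hypothetical example matrix, viewed as the map x -> Ax on R^3.
A = [[1, 2, 3],
     [2, 4, 6],
     [1, 0, 1]]
_, pivots = rref(A)
rank = len(pivots)        # dim Im(A)
ker = kernel_basis(A)     # basis of Ker(A)
print(rank, len(ker), rank + len(ker) == len(A[0]))
```

Exact fractions are used so that the rank is not affected by rounding; a floating-point version would need a tolerance when testing for zero pivots.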

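Given an $m \times n$ matrix $A$ with entries in $F$, one can always define
a linear transformation $\mathcal{A}$ from $F^n$ to $F^m$ by
\[ \mathcal{A}(x) = Ax, \qquad x \in F^n. \qquad (1.3) \]

Conversely, linear transformations can be represented by matrices.
Consider, for example, $\mathcal{A} : \mathbb{R}^2 \to \mathbb{R}^3$ defined by
\[ \mathcal{A}(x_1, x_2)^T = (3x_1,\; 2x_1 + x_2,\; -x_1 - 2x_2)^T. \]
Then $\mathcal{A}$ is a linear transformation. We may write it in the form
\[ \mathcal{A}(x) = Ax, \]
where
\[ x = (x_1, x_2)^T, \qquad A = \begin{pmatrix} 3 & 0 \\ 2 & 1 \\ -1 & -2 \end{pmatrix}. \]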


Let $\mathcal{A}$ be a linear transformation from $V$ to $W$. Once the bases for
$V$ and $W$ have been chosen, $\mathcal{A}$ has a unique matrix representation
$A$ as in (1.3) determined by the images of the basis vectors of $V$
under $\mathcal{A}$, and there is a one-to-one correspondence between the linear
transformations and their matrices. A linear transformation may
have different matrices under different bases. In what follows we show
that these matrices are similar when $V = W$. Two square matrices
$A$ and $B$ of the same size are said to be similar if $P^{-1}AP = B$ for
some invertible matrix $P$.

Let $\mathcal{A}$ be a linear transformation on a vector space $V$ with a basis
$\{u_1, \dots, u_n\}$. Since each $\mathcal{A}(u_i)$ is a vector in $V$, we may write
\[ \mathcal{A}(u_i) = \sum_{j=1}^{n} a_{ji} u_j, \qquad i = 1, \dots, n, \qquad (1.4) \]
and call $A = (a_{ij})$ the matrix of $\mathcal{A}$ under the basis $\{u_1, \dots, u_n\}$.
Write (1.4) conventionally as
\[ \mathcal{A}(u_1, \dots, u_n) = (\mathcal{A}(u_1), \dots, \mathcal{A}(u_n)) = (u_1, \dots, u_n)A. \]

Let $v \in V$. If $v = x_1u_1 + \cdots + x_nu_n$, then
\[ \mathcal{A}(v) = \sum_{i=1}^{n} x_i\,\mathcal{A}(u_i) = (\mathcal{A}(u_1), \dots, \mathcal{A}(u_n))\,x = (u_1, \dots, u_n)Ax, \]
where $x$ is the column vector $(x_1, \dots, x_n)^T$. In case of $\mathbb{R}^n$ or $\mathbb{C}^n$ with
the standard basis $u_1 = e_1, \dots, u_n = e_n$, we have
\[ \mathcal{A}(v) = Ax. \]


Let $\{v_1, \dots, v_n\}$ also be a basis of $V$. Expressing each $u_i$ as a
linear combination of $v_1, \dots, v_n$ gives an $n \times n$ matrix $B$ such that
\[ (u_1, \dots, u_n) = (v_1, \dots, v_n)B. \]

It can be shown (Problem 10) that $B$ is invertible since $\{u_1, \dots, u_n\}$
is a linearly independent set. It follows by using (1.4) that
\[ \mathcal{A}(v_1, \dots, v_n) = \mathcal{A}\big((u_1, \dots, u_n)B^{-1}\big) = (u_1, \dots, u_n)AB^{-1} = (v_1, \dots, v_n)(BAB^{-1}). \]

This says that the matrices of a linear transformation under different
bases $\{u_1, \dots, u_n\}$ and $\{v_1, \dots, v_n\}$ are similar.
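
As a small numerical illustration of the change-of-basis formula, the following sketch forms $BAB^{-1}$ for a 2 x 2 example and compares trace and determinant before and after. The particular matrices and the helper names `matmul` and `inv2` are assumptions chosen for this example only.

```python
from fractions import Fraction

def matmul(X, Y):
    """Plain matrix product of two lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def inv2(M):
    """Inverse of a 2 x 2 matrix via the adjugate formula."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

# Hypothetical data: A is the matrix under {u1, u2}; (u1, u2) = (v1, v2) B.
A = [[Fraction(2), Fraction(1)], [Fraction(0), Fraction(3)]]
B = [[Fraction(1), Fraction(1)], [Fraction(1), Fraction(2)]]

A_v = matmul(matmul(B, A), inv2(B))   # matrix of the same map under {v1, v2}

trace = lambda M: M[0][0] + M[1][1]
det2 = lambda M: M[0][0] * M[1][1] - M[0][1] * M[1][0]
print(A_v, trace(A) == trace(A_v), det2(A) == det2(A_v))
```

The two matrices look different entrywise, yet they share trace and determinant, as similar matrices must (Problem 12).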

Given a linear transformation on a vector space, it is a central
theme of linear algebra to find a basis of the vector space so that the
matrix of the linear transformation is as simple as possible, in the sense
that the matrix contains more zeros or has a particular structure. In
the language of matrices, the given matrix is reduced to a canonical
form via similarity. This is discussed in Chapter 3.

Let $\mathcal{A}$ be a linear transformation on a vector space $V$ over $\mathbb{C}$. A
nonzero vector $v \in V$ is called an eigenvector of $\mathcal{A}$ belonging to an
eigenvalue $\lambda \in \mathbb{C}$ if
\[ \mathcal{A}(v) = \lambda v, \qquad v \ne 0. \]

Figure 1.4: Eigenvalue and eigenvector


If, for example, $\mathcal{A}$ is defined on $\mathbb{R}^2$ by
\[ \mathcal{A}(x, y) = (y, x), \]
then $\mathcal{A}$ has two eigenvalues, 1 and $-1$. What are the eigenvectors?

If $\lambda_1$ and $\lambda_2$ are different eigenvalues of $\mathcal{A}$ with respective
eigenvectors $x_1$ and $x_2$, then $x_1$ and $x_2$ are linearly independent, for if
\[ l_1x_1 + l_2x_2 = 0 \qquad (1.5) \]
for some scalars $l_1$ and $l_2$, then applying $\mathcal{A}$ to both sides yields
\[ l_1\lambda_1x_1 + l_2\lambda_2x_2 = 0. \qquad (1.6) \]
Multiplying both sides of (1.5) by $\lambda_1$, we have
\[ l_1\lambda_1x_1 + l_2\lambda_1x_2 = 0. \qquad (1.7) \]
Subtracting (1.6) from (1.7) results in
\[ l_2(\lambda_1 - \lambda_2)x_2 = 0. \]
It follows that $l_2 = 0$, because $\lambda_1 \ne \lambda_2$ and $x_2 \ne 0$, and thus $l_1 = 0$ from (1.5).

This can be generalized by induction to the following statement.
If $\alpha_{ij}$ are linearly independent eigenvectors corresponding to an
eigenvalue $\lambda_i$, then the set of all eigenvectors $\alpha_{ij}$ for these eigenvalues
$\lambda_i$ together is linearly independent. Simply put:

Theorem 1.6 The eigenvectors belonging to different eigenvalues
are linearly independent.

Let $\mathcal{A}$ be a linear transformation on a vector space $V$ of dimension $n$.
If $\mathcal{A}$ happens to have $n$ linearly independent eigenvectors belonging
to (not necessarily distinct) eigenvalues $\lambda_1, \lambda_2, \dots, \lambda_n$, then
$\mathcal{A}$, under the basis formed by the corresponding eigenvectors, has a
diagonal matrix representation
\[ \begin{pmatrix} \lambda_1 & & & 0 \\ & \lambda_2 & & \\ & & \ddots & \\ 0 & & & \lambda_n \end{pmatrix}. \]

To find eigenvalues and eigenvectors, one needs to convert
\[ \mathcal{A}(v) = \lambda v \]
under a basis into a linear equation system
\[ Ax = \lambda x. \]
Therefore, the eigenvalues of $\mathcal{A}$ are those $\lambda \in F$ such that
\[ \det(\lambda I - A) = 0, \]
and the eigenvectors of $\mathcal{A}$ are the vectors whose coordinates under
the basis are the solutions to the equation system $Ax = \lambda x$.
Suppose $A$ is an $n \times n$ complex matrix. The polynomial in $\lambda$,
\[ p_A(\lambda) = \det(\lambda I_n - A), \qquad (1.8) \]
is called the characteristic polynomial of $A$, and the zeros of the
polynomial are called the eigenvalues of $A$. It follows that every
$n$-square matrix has $n$ eigenvalues over $\mathbb{C}$ (including repeated ones).

The trace of an $n$-square matrix $A$, denoted by $\mathrm{tr}\,A$, is defined to
be the sum of the eigenvalues $\lambda_1, \dots, \lambda_n$ of $A$, that is,
\[ \mathrm{tr}\,A = \lambda_1 + \cdots + \lambda_n. \]
It is easy to see from (1.8) by expanding the determinant that
\[ \mathrm{tr}\,A = a_{11} + \cdots + a_{nn} \]
and
\[ \det A = \prod_{i=1}^{n} \lambda_i. \]
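
For a 2 x 2 matrix the characteristic polynomial (1.8) reduces to $\lambda^2 - (\mathrm{tr}\,A)\lambda + \det A$, so the eigenvalues come from the quadratic formula. The sketch below uses this special case only; the function name `eigenvalues_2x2` is an assumption made for the example. It checks that the eigenvalues add up to the trace and multiply to the determinant.

```python
import cmath

def eigenvalues_2x2(A):
    """Roots of lambda^2 - (tr A) lambda + det A for a 2 x 2 matrix A."""
    (a, b), (c, d) = A
    tr = a + d
    det = a * d - b * c
    disc = cmath.sqrt(tr * tr - 4 * det)   # complex square root, so real input is fine
    return (tr + disc) / 2, (tr - disc) / 2

# Matrix of the map (x, y) -> (y, x) under the standard basis of R^2.
swap = [[0, 1], [1, 0]]
l1, l2 = eigenvalues_2x2(swap)

# A rotation by 90 degrees has no real eigenvalues, but two complex ones.
rot = [[0, -1], [1, 0]]
r1, r2 = eigenvalues_2x2(rot)

print(l1, l2, r1, r2)
```

For larger matrices one would not expand the determinant by hand; the quadratic formula is used here only because it keeps the example self-contained.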
Let $\mathcal{A}$ be a linear transformation on a vector space $V$. Let $W$ be
a subspace of $V$. If for every $w \in W$, $\mathcal{A}(w) \in W$, we say that $W$ is
invariant under $\mathcal{A}$. Obviously $\{0\}$ and $V$ are invariant under $\mathcal{A}$. It
is easy to check that $\mathrm{Ker}\,\mathcal{A}$ and $\mathrm{Im}\,\mathcal{A}$ are invariant under $\mathcal{A}$ too.

Figure 1.5: Invariant subspace


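Let $V$ be a vector space over a field. Consider all linear transformations
(operators) on $V$ and denote the set by $L(V)$. Then $L(V)$ is
a vector space under the following addition and scalar multiplication:
\[ (\mathcal{A} + \mathcal{B})(v) = \mathcal{A}(v) + \mathcal{B}(v), \qquad (k\mathcal{A})(v) = k\,\mathcal{A}(v). \]

The zero vector in $L(V)$ is the zero transformation. And for every
$\mathcal{A} \in L(V)$, $-\mathcal{A}$ is the operator $(-\mathcal{A})(v) = -(\mathcal{A}(v))$. For two operators
$\mathcal{A}$ and $\mathcal{B}$ on $V$, define the product of $\mathcal{A}$ and $\mathcal{B}$ by
\[ (\mathcal{A}\mathcal{B})(v) = \mathcal{A}(\mathcal{B}(v)), \qquad v \in V. \]
Then $\mathcal{A}\mathcal{B}$ is again a linear transformation on $V$. The identity
transformation $\mathcal{I}$ is the one such that $\mathcal{I}(v) = v$ for all $v \in V$.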


Problems 


1. Show that the map $\mathcal{A}$ from $\mathbb{R}^3$ to $\mathbb{R}^3$ defined by
\[ \mathcal{A}(x, y, z) = (x + y,\; x - y,\; z) \]
is a linear transformation. Find its matrix under the standard basis.

2. Find the dimensions of $\mathrm{Im}(\mathcal{A})$ and $\mathrm{Ker}(\mathcal{A})$, and find their bases for
the linear transformation $\mathcal{A}$ on $\mathbb{R}^3$ defined by
\[ \mathcal{A}(x, y, z) = (x - 2z,\; y + z,\; 0). \]

3. Define a linear transformation $\mathcal{A} : \mathbb{R}^2 \to \mathbb{R}^2$ by
\[ \mathcal{A}(x, y) = (y, 0). \]

(a) Find $\mathrm{Im}(\mathcal{A})$ and $\mathrm{Ker}(\mathcal{A})$.
(b) Find a matrix representation of $\mathcal{A}$.
(c) Verify that $\dim \mathbb{R}^2 = \dim \mathrm{Im}(\mathcal{A}) + \dim \mathrm{Ker}(\mathcal{A})$.
(d) Is $\mathrm{Im}(\mathcal{A}) + \mathrm{Ker}(\mathcal{A})$ a direct sum?
(e) Does $\mathbb{R}^2 = \mathrm{Im}(\mathcal{A}) + \mathrm{Ker}(\mathcal{A})$?

4. Find the eigenvalues and eigenvectors of the differential operator $D_x$.


5. Find the eigenvalues and corresponding eigenvectors of the matrix 


eee 


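6. Find the eigenvalues of the matrix
\[ A = \begin{pmatrix} 4 & -2 & -1 & 0 \\ -2 & 4 & 0 & -1 \\ -1 & 0 & 4 & -2 \\ 0 & -1 & -2 & 4 \end{pmatrix}. \]

7. Let $\lambda$ be an eigenvalue of $\mathcal{A}$ on a vector space $V$, and let
\[ V_\lambda = \{v \in V : \mathcal{A}(v) = \lambda v\}, \]
called the eigenspace of $\lambda$. Show that $V_\lambda$ is an invariant subspace of
$V$ under $\mathcal{A}$; that is, it is a subspace and $\mathcal{A}(v) \in V_\lambda$ for every $v \in V_\lambda$.

8. Define linear transformations $\mathcal{A}$ and $\mathcal{B}$ on $\mathbb{R}^2$ by
\[ \mathcal{A}(x, y) = (x + y,\; y), \qquad \mathcal{B}(x, y) = (x + y,\; x - y). \]
Find all eigenvalues of $\mathcal{A}$ and $\mathcal{B}$ and their eigenspaces.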
9. Let $p(x) = \det(xI - A)$ be the characteristic polynomial of matrix
$A \in M_n$. If $\lambda$ is an eigenvalue of $A$ such that $p(x) = (x - \lambda)^k q(x)$ for
some polynomial $q(x)$ with $q(\lambda) \ne 0$, show that
\[ k \ge \dim V_\lambda. \]
[Note: Such a $k$ is known as the algebraic multiplicity of $\lambda$; $\dim V_\lambda$
is the geometric multiplicity of $\lambda$. When we say multiplicity of $\lambda$, we
usually mean the former unless otherwise stated.]

10. Let $\{u_1, \dots, u_n\}$ and $\{v_1, \dots, v_n\}$ be two bases of a vector space $V$.
Show that there exists an invertible matrix $B$ such that
\[ (u_1, \dots, u_n) = (v_1, \dots, v_n)B. \]

11. Let $\{u_1, \dots, u_n\}$ be a basis for a vector space $V$ and let $\{v_1, \dots, v_k\}$
be a set of vectors in $V$. If $v_i = \sum_{j=1}^{n} a_{ij}u_j$, $i = 1, \dots, k$, show that
\[ \dim \mathrm{Span}\{v_1, \dots, v_k\} = \mathrm{rank}(A), \quad \text{where } A = (a_{ij}). \]

12. Show that similar matrices have the same trace and determinant.

13. Let $v_1$ and $v_2$ be eigenvectors of matrix $A$ belonging to different
eigenvalues $\lambda_1$ and $\lambda_2$, respectively. Show that $v_1 + v_2$ is not an
eigenvector of $A$. How about $av_1 + bv_2$, $a, b \in \mathbb{R}$?

14. Let $A \in M_n$ and let $S \in M_n$ be nonsingular. If the first column of
$S^{-1}AS$ is $(\lambda, 0, \dots, 0)^T$, show that $\lambda$ is an eigenvalue of $A$ and that
the first column of $S$ is an eigenvector of $A$ belonging to $\lambda$.

15. Let $x \in \mathbb{C}^n$. Find the eigenvalues and eigenvectors of the matrices
\[ A_1 = xx^* \quad \text{and} \quad A_2 = \begin{pmatrix} 0 & x^* \\ x & 0 \end{pmatrix}. \]

16. If each row sum (i.e., the sum of all entries in a row) of matrix $A$ is
1, show that 1 is an eigenvalue of $A$.

17. If $\lambda$ is an eigenvalue of $A \in M_n$, show that $\lambda^2$ is an eigenvalue of $A^2$
and that if $A$ is invertible, then $\lambda^{-1}$ is an eigenvalue of $A^{-1}$.

18. If $x \in \mathbb{C}^n$ is an eigenvector of $A \in M_n$ belonging to the eigenvalue $\lambda$,
show that for any $y \in \mathbb{C}^n$, $\lambda + y^*x$ is an eigenvalue of $A + xy^*$.
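19. A minor of a matrix $A \in M_n$ is the determinant of a square submatrix
of $A$. Show that
\[ \det(\lambda I - A) = \lambda^n - \delta_1\lambda^{n-1} + \delta_2\lambda^{n-2} - \cdots + (-1)^n \det A, \]
where $\delta_i$ is the sum of all principal minors of order $i$, $i = 1, 2, \dots, n-1$.
[Note: A principal minor is the determinant of a submatrix indexed
by the same rows and columns, called a principal submatrix.]

20. A linear transformation $\mathcal{A}$ on a vector space $V$ is said to be invertible
if there exists a linear transformation $\mathcal{B}$ such that $\mathcal{A}\mathcal{B} = \mathcal{B}\mathcal{A} = \mathcal{I}$,
the identity. If $\dim V < \infty$, show that the following are equivalent.

(a)
(b)
(c) If $\{u_1, \dots, u_n\}$ is a basis for $V$, then so is $\{\mathcal{A}u_1, \dots, \mathcal{A}u_n\}$.
(d) $\mathcal{A}$ is one-to-one.
(e)
(f)

21. Let $\mathcal{A}$ be a linear transformation on a vector space of dimension $n$
with matrix representation $A$. Show that
\[ \dim \mathrm{Im}(\mathcal{A}) = \mathrm{rank}(A) \quad \text{and} \quad \dim \mathrm{Ker}(\mathcal{A}) = n - \mathrm{rank}(A). \]

22. Let $\mathcal{A}$ and $\mathcal{B}$ be linear transformations on a finite-dimensional vector
space $V$ having the same image; that is, $\mathrm{Im}(\mathcal{A}) = \mathrm{Im}(\mathcal{B})$. If
\[ V = \mathrm{Im}(\mathcal{A}) \oplus \mathrm{Ker}(\mathcal{A}) = \mathrm{Im}(\mathcal{B}) \oplus \mathrm{Ker}(\mathcal{B}), \]
does it follow that $\mathrm{Ker}(\mathcal{A}) = \mathrm{Ker}(\mathcal{B})$?

23. Consider the vector space $F[x]$ of all polynomials over $F$ ($= \mathbb{R}$ or $\mathbb{C}$).
For $f(x) = a_nx^n + a_{n-1}x^{n-1} + \cdots + a_1x + a_0 \in F[x]$, define
\[ \mathcal{S}(f(x)) = \frac{a_n}{n+1}x^{n+1} + \frac{a_{n-1}}{n}x^{n} + \cdots + \frac{a_1}{2}x^2 + a_0x \]
and
\[ \mathcal{T}(f(x)) = na_nx^{n-1} + (n-1)a_{n-1}x^{n-2} + \cdots + a_1. \]
Compute $\mathcal{S}\mathcal{T}$ and $\mathcal{T}\mathcal{S}$. Does $\mathcal{S}\mathcal{T} = \mathcal{T}\mathcal{S}$?

24. Define $\mathcal{P} : \mathbb{C}^n \to \mathbb{C}^n$ by $\mathcal{P}(x) = (0, 0, x_3, \dots, x_n)$. Show that $\mathcal{P}$ is a
linear transformation and $\mathcal{P}^2 = \mathcal{P}$. What is $\mathrm{Ker}(\mathcal{P})$?

25. Let $\mathcal{A}$ be a linear transformation on a finite-dimensional vector space
$V$, and let $W$ be a subspace of $V$. Denote
\[ \mathcal{A}(W) = \{\mathcal{A}(w) : w \in W\}. \]
Show that $\mathcal{A}(W)$ is a subspace of $V$. Furthermore, show that
\[ \dim(\mathcal{A}(W)) + \dim(\mathrm{Ker}(\mathcal{A}) \cap W) = \dim W. \]

26. Let $V$ be a vector space of dimension $n$ over $\mathbb{C}$ and let $\{u_1, \dots, u_n\}$
be a basis of $V$. For $x = x_1u_1 + \cdots + x_nu_n \in V$, define
\[ \mathcal{T}(x) = (x_1, \dots, x_n) \in \mathbb{C}^n. \]
Show that $\mathcal{T}$ is an isomorphism, or $\mathcal{T}$ is one-to-one, onto, and satisfies
\[ \mathcal{T}(ax + by) = a\mathcal{T}(x) + b\mathcal{T}(y), \qquad x, y \in V, \; a, b \in \mathbb{C}. \]

27. Let $V$ be the vector space of all sequences
\[ (c_1, c_2, \dots), \qquad c_i \in \mathbb{C}, \; i = 1, 2, \dots. \]
Define a linear transformation on $V$ by
\[ \mathcal{S}(c_1, c_2, \dots) = (0, c_1, c_2, \dots). \]
Show that $\mathcal{S}$ has no eigenvalues. Moreover, if we define
\[ \mathcal{S}^*(c_1, c_2, c_3, \dots) = (c_2, c_3, \dots), \]
then $\mathcal{S}^*\mathcal{S}$ is the identity, but $\mathcal{S}\mathcal{S}^*$ is not.

28. Let $\mathcal{A}$ be a linear operator on a vector space $V$ of dimension $n$. Let
\[ V_0 = \bigcup_{i=1}^{\infty} \mathrm{Ker}(\mathcal{A}^i), \qquad V_1 = \bigcap_{i=1}^{\infty} \mathrm{Im}(\mathcal{A}^i). \]
Show that $V_0$ and $V_1$ are invariant under $\mathcal{A}$ and that $V = V_0 \oplus V_1$.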
1.4 Inner Product Spaces


A vector space $V$ over the number field $\mathbb{C}$ or $\mathbb{R}$ is called an inner
product space if it is equipped with an inner product $(\,\cdot\,,\cdot\,)$ satisfying,
for all $u, v, w \in V$ and scalar $c$,

1. $(u, u) \ge 0$, and $(u, u) = 0$ if and only if $u = 0$,

2. $(u + v, w) = (u, w) + (v, w)$,

3. $(cu, v) = c\,(u, v)$,

4. $(u, v) = \overline{(v, u)}$.


$\mathbb{C}^n$ is an inner product space over $\mathbb{C}$ with the inner product
\[ (x, y) = y^*x = \overline{y_1}x_1 + \cdots + \overline{y_n}x_n. \]


An inner product space over R is usually called a Euclidean space. 
The Cauchy-Schwarz inequality for an inner product space is one 
of the most useful inequalities in mathematics. 


Theorem 1.7 (Cauchy-Schwarz Inequality) Let $V$ be an inner
product space. Then for all vectors $x$ and $y$ in $V$,
\[ |(x, y)|^2 \le (x, x)(y, y). \]
Equality holds if and only if $x$ and $y$ are linearly dependent.


The proof of this can be done in a number of different ways. The
most common proof is to consider the quadratic function in $t$
\[ (x + ty,\; x + ty) \ge 0 \]
and to derive the inequality from the nonpositive discriminant. One
may also obtain the inequality from $(z, z) \ge 0$ by setting
\[ z = y - \frac{(y, x)}{(x, x)}\,x, \qquad x \ne 0, \]
and showing that $(z, x) = 0$ and then $(z, z) = (z, y) \ge 0$.
A matrix proof is left as an exercise for the reader (Problem 13).
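
The second proof idea can be traced numerically. The sketch below is an illustration only (the vectors are arbitrary and the helper name `inner` is an assumption). It uses the inner product $(x, y) = y^*x$ on $\mathbb{C}^3$, forms $z = y - \frac{(y,x)}{(x,x)}x$, and checks that $(z, x) = 0$, that $(z, z) = (z, y)$, and that the inequality holds, with equality for linearly dependent vectors.

```python
def inner(x, y):
    """(x, y) = y* x: linear in x, conjugate-linear in y."""
    return sum(xi * yi.conjugate() for xi, yi in zip(x, y))

# Arbitrary sample vectors in C^3.
x = [1 + 2j, 3 - 1j, 0.5j]
y = [2 - 1j, 1j, 4 + 0j]

lhs = abs(inner(x, y)) ** 2
rhs = (inner(x, x) * inner(y, y)).real

# z = y - ((y, x) / (x, x)) x is orthogonal to x.
coef = inner(y, x) / inner(x, x)
z = [yi - coef * xi for xi, yi in zip(x, y)]
orth_gap = abs(inner(z, x))                   # should be (numerically) zero
norm_gap = abs(inner(z, z) - inner(z, y))     # (z, z) = (z, y)

# Equality case: y2 is a scalar multiple of x.
y2 = [(2 - 3j) * xi for xi in x]
eq_gap = abs(abs(inner(x, y2)) ** 2 - (inner(x, x) * inner(y2, y2)).real)

print(lhs <= rhs, orth_gap, norm_gap, eq_gap)
```

Since $(z, y) = (y, y) - |(y, x)|^2/(x, x)$, the nonnegativity of $(z, z)$ is exactly the Cauchy-Schwarz inequality.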

For any vector $x$ in an inner product space, the nonnegative square
root of $(x, x)$ is called the length or norm of the vector $x$ and is
denoted by $\|x\|$; that is,
\[ \|x\| = \sqrt{(x, x)}. \]

Thus, the Cauchy-Schwarz inequality is rewritten as
\[ |(x, y)| \le \|x\|\,\|y\|. \]

Theorem 1.8 For all vectors $x$ and $y$ in an inner product space,

i. $\|x\| \ge 0$; ii. $\|cx\| = |c|\,\|x\|$, $c \in \mathbb{C}$; iii. $\|x + y\| \le \|x\| + \|y\|$.

The last inequality is referred to as the triangle inequality.

A unit vector is a vector whose length is 1. For any nonzero
vector $u$, $\frac{1}{\|u\|}u$ is a unit vector. Two vectors $x$ and $y$ are said to
be orthogonal if $(x, y) = 0$. An orthogonal set is a set in which any
two of the vectors are orthogonal. Such a set is further said to be
orthonormal if every vector in the set is of length 1.

For example, $\{v_1, v_2\}$ is an orthonormal set in $\mathbb{R}^2$, where
\[ v_1 = \left(\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}\right), \qquad v_2 = \left(\frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}}\right). \]

The column (row) vectors of a unitary matrix are orthonormal.


Let $S$ be a subset of an inner product space $V$. Denote by $S^\perp$
the collection of the vectors in $V$ that are orthogonal to all vectors
in $S$; that is,
\[ S^\perp = \{v \in V : (v, s) = 0 \text{ for all } s \in S\}. \]
It is easy to see that $S^\perp$ is a subspace of $V$. If $S$ contains only
one element, say $x$, we simply use $x^\perp$ for $S^\perp$. For two subsets $S_1$ and
$S_2$, if $(x, y) = 0$ for all $x \in S_1$ and $y \in S_2$, we write $S_1 \perp S_2$.

Figure 1.6: Orthogonality


As we saw in the first section, a set of linearly independent vectors
of a vector space of finite dimension can be extended to a basis for
the vector space. Likewise a set of orthogonal vectors of an inner
product space can be extended to an orthogonal basis of the space.
The same is true for a set of orthonormal vectors. Consider $\mathbb{C}^n$, for
instance. Let $u_1$ be a unit vector in $\mathbb{C}^n$. Pick a unit vector $u_2$ in
$u_1^\perp$ if $n \ge 2$. Then $u_1$ and $u_2$ are orthonormal. Now if $n \ge 3$, let
$u_3$ be a unit vector in $(\mathrm{Span}\{u_1, u_2\})^\perp$ (equivalently, $(u_1, u_3) = 0$
and $(u_2, u_3) = 0$). Then $u_1, u_2, u_3$ are orthonormal. Continuing this
way, one obtains an orthonormal basis for the inner product space.
We summarize this as a theorem for $\mathbb{C}^n$, which will be freely and
frequently used in the future.


Theorem 1.9 If $u_1, \dots, u_k$ are $k$ linearly independent column vectors
in $\mathbb{C}^n$, $1 \le k < n$, then there exist $n - k$ column vectors
$u_{k+1}, \dots, u_n$ in $\mathbb{C}^n$ such that the matrix
\[ P = (u_1, \dots, u_k, u_{k+1}, \dots, u_n) \]
is invertible. Furthermore, if $u_1, \dots, u_k$ are orthonormal, then there
exist $n - k$ unit vectors $u_{k+1}, \dots, u_n$ in $\mathbb{C}^n$ such that the matrix
\[ U = (u_1, \dots, u_k, u_{k+1}, \dots, u_n) \]
is unitary. In particular, for any unit vector $u$ in $\mathbb{C}^n$, there exists a
unitary matrix that contains $u$ as its first column.
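
One concrete way to carry out the extension in Theorem 1.9 is sketched below. It is an assumption of this example, not necessarily the construction intended in the text: each standard basis vector is orthogonalized against the vectors collected so far (a Gram-Schmidt step) and kept if something nonzero remains. The function name `extend_to_orthonormal_basis` and the tolerance are hypothetical choices.

```python
import math

def inner(x, y):
    """(x, y) = y* x on C^n."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def extend_to_orthonormal_basis(vectors, n, tol=1e-12):
    """Given orthonormal vectors in C^n, append unit vectors until there are n."""
    basis = [list(v) for v in vectors]
    for k in range(n):
        if len(basis) == n:
            break
        # candidate: the standard basis vector e_k
        e = [1.0 if i == k else 0.0 for i in range(n)]
        # remove the components along the vectors already in the basis
        for u in basis:
            c = inner(e, u)
            e = [ei - c * ui for ei, ui in zip(e, u)]
        norm = math.sqrt(inner(e, e).real)
        if norm > tol:                      # skip candidates already in the span
            basis.append([ei / norm for ei in e])
    return basis

s = 1 / math.sqrt(2)
u = [s, s * 1j, 0]                          # a unit vector in C^3
U_cols = extend_to_orthonormal_basis([u], 3)

# Gram matrix of the columns; it should be close to the identity.
gram = [[inner(a, b) for b in U_cols] for a in U_cols]
print(len(U_cols))
```

The columns returned form a unitary matrix whose first column is the given unit vector, up to rounding error.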


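If $\{u_1, \dots, u_n\}$ is an orthonormal basis of an inner product space
$V$ over $\mathbb{C}$, and if $x$ and $y$ are two vectors in $V$ expressed as
\[ x = x_1u_1 + \cdots + x_nu_n, \qquad y = y_1u_1 + \cdots + y_nu_n, \]
then $x_i = (x, u_i)$, $y_i = (y, u_i)$ for $i = 1, \dots, n$,
\[ \|x\| = \left(|x_1|^2 + \cdots + |x_n|^2\right)^{1/2} \qquad (1.9) \]
and
\[ (x, y) = \overline{y_1}x_1 + \cdots + \overline{y_n}x_n. \qquad (1.10) \]
For $A \in M_n$ and with the standard basis $e_1, \dots, e_n$ of $\mathbb{C}^n$, we have
\[ \mathrm{tr}\,A = \sum_{i=1}^{n} (Ae_i, e_i) \]
and for $x \in \mathbb{C}^n$
\[ (Ax, x) = x^*Ax = \sum_{i,j=1}^{n} a_{ij}\,\overline{x_i}\,x_j. \]
Upon computation, we have
\[ (Ax, y) = (x, A^*y) \]
and, with $\mathrm{Im}\,A = \{Ax : x \in \mathbb{C}^n\}$ and $\mathrm{Ker}\,A = \{x \in \mathbb{C}^n : Ax = 0\}$,
\[ \mathrm{Ker}\,A^* = (\mathrm{Im}\,A)^\perp, \qquad \mathrm{Im}\,A^* = (\mathrm{Ker}\,A)^\perp. \qquad (1.11) \]
$M_n$ is an inner product space with the inner product
\[ (A, B)_M = \mathrm{tr}(B^*A), \qquad A, B \in M_n. \]

It is immediate by the Cauchy-Schwarz inequality that
\[ |\mathrm{tr}(AB)|^2 \le \mathrm{tr}(A^*A)\,\mathrm{tr}(B^*B) \]
and that
\[ \mathrm{tr}(A^*A) = 0 \text{ if and only if } A = 0. \]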
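
The adjoint identity $(Ax, y) = (x, A^*y)$ and the trace inequality can be spot-checked on small complex matrices. The sketch below uses arbitrary sample data; the helper names (`conj_transpose`, `matvec`, and so on) are assumptions introduced for this example.

```python
def conj_transpose(M):
    """Conjugate transpose M* of a matrix given as a list of rows."""
    return [[M[i][j].conjugate() for i in range(len(M))] for j in range(len(M[0]))]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def matvec(M, x):
    return [sum(a * b for a, b in zip(row, x)) for row in M]

def inner(x, y):
    """(x, y) = y* x."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def trace(M):
    return sum(M[i][i] for i in range(len(M)))

# Arbitrary sample data in M_2 and C^2.
A = [[1 + 1j, 2], [0, 3 - 2j]]
B = [[2, -1j], [1 + 1j, 1]]
x = [1 - 1j, 2j]
y = [3, 1 + 2j]

# (Ax, y) = (x, A* y)
adjoint_gap = abs(inner(matvec(A, x), y) - inner(x, matvec(conj_transpose(A), y)))

# |tr(AB)|^2 <= tr(A*A) tr(B*B)
tr_sq = abs(trace(matmul(A, B))) ** 2
bound = (trace(matmul(conj_transpose(A), A)) * trace(matmul(conj_transpose(B), B))).real

print(adjoint_gap, tr_sq, bound)
```

The trace inequality is the Cauchy-Schwarz inequality for the inner product $(A, B)_M$ applied to $B$ and $A^*$.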


We end this section by presenting an inequality of the angles
between vectors in a Euclidean space. This inequality is intuitive
and obvious in $\mathbb{R}^2$ and $\mathbb{R}^3$. The good part of this theorem is the idea
in its proof of reducing the problem to $\mathbb{R}^2$ or $\mathbb{R}^3$.

Let $V$ be an inner product space over $\mathbb{R}$. For any nonzero vectors
$x$ and $y$, define the (measure of) angle between $x$ and $y$ by
\[ \angle_{x,y} = \cos^{-1}\frac{(x, y)}{\|x\|\,\|y\|}. \]


Theorem 1.10 For any nonzero vectors $x$, $y$, and $z$ in a Euclidean
space (i.e., an inner product space $V$ over $\mathbb{R}$),
\[ \angle_{x,z} \le \angle_{x,y} + \angle_{y,z}. \]
Equality occurs if and only if $y = ax + bz$, $a, b \ge 0$.


Proof. Because the inequality involves only the vectors $x$, $y$, and $z$,
we may focus on the subspace $\mathrm{Span}\{x, y, z\}$, which has dimension at
most 3. We can further choose an orthonormal basis (a unit vector
in the case of dimension one) for this subspace. Let $x$, $y$, and $z$ have
coordinate vectors $\alpha$, $\beta$, and $\gamma$ under the basis, respectively. Then
the inequality holds if and only if it holds for real vectors $\alpha$, $\beta$, and
$\gamma$, due to (1.9) and (1.10). Thus, the problem is reduced to $\mathbb{R}$, $\mathbb{R}^2$, or
$\mathbb{R}^3$ depending on whether the dimension of $\mathrm{Span}\{x, y, z\}$ is 1, 2, or
3, respectively. For $\mathbb{R}$, the assertion is trivial. For $\mathbb{R}^2$ or $\mathbb{R}^3$, a simple
graph will do the job. ∎


Problems 


1. If $V$ is an inner product space over $\mathbb{C}$, show that for $x, y \in V$, $c \in \mathbb{C}$,
\[ (x, cy) = \overline{c}\,(x, y) \quad \text{and} \quad (x, y)(y, x) = |(x, y)|^2. \]

2. Find all vectors in $\mathbb{R}^2$ (with the usual inner product) that are
orthogonal to $(1, 1)$. Is $(1, 1)$ a unit vector?

3. Show that in an inner product space over $\mathbb{R}$ or $\mathbb{C}$
\[ (x, y) = 0 \;\Rightarrow\; \|x + y\|^2 = \|x\|^2 + \|y\|^2 \]
and that the converse is true over $\mathbb{R}$ but not over $\mathbb{C}$.

4. Let $V$ be an inner product space over $\mathbb{R}$ and let $x, y \in V$. Show that
\[ \|x\| = \|y\| \;\Rightarrow\; (x + y,\; x - y) = 0. \]

5. Show that for any two vectors $x$ and $y$ in an inner product space
\[ \|x - y\|^2 + \|x + y\|^2 = 2\left(\|x\|^2 + \|y\|^2\right). \]

6. Is $[\,\cdot\,,\cdot\,]$ defined as $[x, y] = x_1y_1 + \cdots + x_ny_n$ an inner product for $\mathbb{C}^n$?

7. For what diagonal $D \in M_n$ is $[x, y] = y^*Dx$ an inner product for $\mathbb{C}^n$?

8. Let $\{u_1, \dots, u_n\}$ be an orthonormal basis of an inner product space
$V$. Show that for $x \in V$
\[ (u_i, x) = 0, \; i = 1, \dots, n, \;\Leftrightarrow\; x = 0, \]
and that for $x, y \in V$
\[ (u_i, x) = (u_i, y), \; i = 1, \dots, n, \;\Leftrightarrow\; x = y. \]

9. Let $\{v_1, \dots, v_n\}$ be a set of vectors in an inner product space $V$.
Denote the matrix with entries $(v_i, v_j)$ by $G$. Show that $\det G = 0$ if
and only if $v_1, \dots, v_n$ are linearly dependent.

10. Let $A$ be an $n$-square complex matrix. Show that
\[ \mathrm{tr}(AX) = 0 \text{ for every } X \in M_n \;\Leftrightarrow\; A = 0. \]

11. Let $A \in M_n$. Show that for any unit vector $x \in \mathbb{C}^n$,
\[ |x^*Ax|^2 \le x^*A^*Ax. \]

12. Let $A = (a_{ij})$ be a complex matrix. Show that
\[ \mathrm{tr}(A^*A) = \mathrm{tr}(AA^*) = \sum_{i,j} |a_{ij}|^2. \]

13. Use the fact that $\mathrm{tr}(A^*A) \ge 0$ with equality if and only if $A = 0$ to
show the Cauchy-Schwarz inequality for $\mathbb{C}^n$ by taking $A = xy^* - yx^*$.

14. Let $A$ and $B$ be complex matrices of the same size. Show that
\[ |\mathrm{tr}(A^*B)| \le \left(\mathrm{tr}(A^*A)\,\mathrm{tr}(B^*B)\right)^{1/2} \le \tfrac{1}{2}\left(\mathrm{tr}(A^*A) + \mathrm{tr}(B^*B)\right). \]
If $A$ and $B$ are both $n$-square, then $\mathrm{tr}(A^*B)$ on the left-hand side
may be replaced by $\mathrm{tr}(AB)$, even though $\mathrm{tr}(A^*B) \ne \mathrm{tr}(AB)$. Why?

15. Let $A$ and $B$ be $n$-square complex matrices. If, for every $x \in \mathbb{C}^n$,
\[ (Ax, x) = (Bx, x), \]
does it follow that $A = B$? What if $x \in \mathbb{R}^n$?

16. Show that for any $n$-square complex matrix $A$
\[ \mathbb{C}^n = \mathrm{Im}\,A \oplus (\mathrm{Im}\,A)^\perp = \mathrm{Im}\,A \oplus \mathrm{Ker}\,A^* \]
and that $\mathbb{C}^n = \mathrm{Im}\,A \oplus \mathrm{Ker}\,A$ if and only if $\mathrm{rank}(A^2) = \mathrm{rank}(A)$.
17. Let $\delta_i$ and $\lambda_i$ be positive numbers and $\sum_{i=1}^{n} \delta_i = 1$. Show that
\[ 1 \le \left(\sum_{i=1}^{n} \delta_i\lambda_i\right)\left(\sum_{i=1}^{n} \delta_i\lambda_i^{-1}\right). \]

18. Let $\{u_1, \dots, u_n\}$ be an orthonormal basis of an inner product space
$V$. Show that $x_1, \dots, x_m$ in $V$ are pairwise orthogonal if and only if
\[ \sum_{k=1}^{n} (x_i, u_k)\,\overline{(x_j, u_k)} = 0, \qquad i \ne j. \]

19. If $\{u_1, \dots, u_k\}$ is an orthonormal set in an inner product space $V$ of
dimension $n$, show that $k \le n$ and for any $x \in V$,
\[ \|x\|^2 \ge \sum_{i=1}^{k} |(u_i, x)|^2. \]

20. Let $\{u_1, \dots, u_n\}$ and $\{v_1, \dots, v_n\}$ be two orthonormal bases of an
inner product space. Show that there exists a unitary $U$ such that
\[ (u_1, \dots, u_n) = (v_1, \dots, v_n)U. \]

21. (Gram-Schmidt Orthonormalization) Let $x_1, x_2, \dots, x_n$ be linearly
independent vectors in an inner product space. Let $y_1 = x_1$ and define
$z_1 = \|y_1\|^{-1}y_1$; let $y_2 = x_2 - (x_2, z_1)z_1$ and define $z_2 = \|y_2\|^{-1}y_2$.
Then $z_1, z_2$ are orthonormal. Continue this process inductively:
\[ y_k = x_k - (x_k, z_{k-1})z_{k-1} - (x_k, z_{k-2})z_{k-2} - \cdots - (x_k, z_1)z_1, \qquad z_k = \|y_k\|^{-1}y_k. \]
Show that the vectors $z_1, z_2, \dots, z_n$ are orthonormal.

22. Prove or disprove for unit vectors $u, v, w$ in an inner product space
\[ |(u, w)| \le |(u, v)| + |(v, w)|. \]

23. Let $V$ be an inner product space and $\|\cdot\|$ be the induced norm, that
is, $\|u\| = \sqrt{(u, u)}$, $u \in V$. Show that for vectors $x$ and $y$ in $V$,
\[ \|x + y\| = \|x\| + \|y\| \]
if and only if one of the vectors is a nonnegative multiple of the other.

24. If the angle between nonzero vectors $x$ and $y$ in an inner product
space over $\mathbb{R}$ or $\mathbb{C}$ is defined by $\angle_{x,y} = \cos^{-1}\frac{|(x, y)|}{\|x\|\,\|y\|}$, show that $x \perp y$ if
and only if $\angle_{x,y} = \frac{\pi}{2}$. Explain why the law of cosines does not hold.
25. Let $V$ be a (not necessarily inner product) vector space over a field
$F$ ($= \mathbb{R}$ or $\mathbb{C}$). We say that $V$ is a normed space if it is equipped with
a function $\|\cdot\| : V \to \mathbb{R}$, called a vector norm, satisfying
\[ \text{(i) } \|x\| \ge 0, \qquad \text{(ii) } \|cx\| = |c|\,\|x\|, \qquad \text{(iii) } \|x + y\| \le \|x\| + \|y\|, \]
for all $x, y \in V$, $c \in F$, and $\|x\| = 0$ if and only if $x = 0$. Show that

(a) If $x > 0$ in $V = \mathbb{R}^n$ entrywise, then $\|x\| > 0$.
(b) $\big|\,\|x\| - \|y\|\,\big| \le \|x - y\|$ for all vectors $x$ and $y$ in $V$.
(c) Give an example showing it is possible that $\|x + y\| = \|x\| + \|y\|$
for some vectors $x$ and $y$ that are linearly independent.

26. Let $V_1$ and $V_2$ be subsets of an inner product space $V$. Show that
\[ V_1 \subseteq V_2 \;\Rightarrow\; V_2^\perp \subseteq V_1^\perp. \]

27. Let $V_1$ and $V_2$ be subspaces of an inner product space $V$. Show that
\[ (V_1 + V_2)^\perp = V_1^\perp \cap V_2^\perp \]
and
\[ (V_1 \cap V_2)^\perp = V_1^\perp + V_2^\perp. \]

28. Let $V_1$ and $V_2$ be subspaces of an inner product space $V$ of dimension
$n$. If $\dim V_1 > \dim V_2$, show that there exists a subspace $V_3$ such that
\[ V_3 \subseteq V_1, \qquad V_3 \perp V_2, \qquad \dim V_3 \ge \dim V_1 - \dim V_2. \]
Give a geometric explanation of this in $\mathbb{R}^3$.

29. Let $A$ and $B$ be $m \times n$ complex matrices. Show that
\[ \mathrm{Im}\,A \perp \mathrm{Im}\,B \;\Leftrightarrow\; A^*B = 0. \]

30. Let $A$ be an $n$-square complex matrix. Show that for any $x, y \in \mathbb{C}^n$
\[ 4(Ax, y) = (As, s) - (At, t) + i(Au, u) - i(Av, v), \]
where $s = x + y$, $t = x - y$, $u = x + iy$, and $v = x - iy$.

31. Let $u$ be a nonzero vector in an inner product space $V$. If $v_1, v_2, \dots, v_k$
are vectors in $V$ such that (i) $(v_i, u) > 0$ for all $i$ and (ii) $(v_i, v_j) \le 0$
whenever $i \ne j$, show that $v_1, v_2, \dots, v_k$ are linearly independent.

32. Show that for any nonzero vectors $x$ and $y$ in $\mathbb{C}^n$
\[ \|x - y\| \ge \frac{1}{2}\left(\|x\| + \|y\|\right)\left\|\frac{1}{\|x\|}x - \frac{1}{\|y\|}y\right\|. \]
CHAPTER 2


Partitioned Matrices, Rank, and Eigenvalues


Introduction: We begin with the elementary operations on partitioned
(block) matrices, followed by discussions of the inverse and
rank of the sum and product of matrices. We then present four
different proofs of the theorem that the products $AB$ and $BA$ of
matrices $A$ and $B$ of sizes $m \times n$ and $n \times m$, respectively, have the
same nonzero eigenvalues. At the end of this chapter we discuss the
often-used matrix technique of continuity argument and the tool for
localizing eigenvalues by means of the Geršgorin discs.


2.1 Elementary Operations of Partitioned Matrices 


The manipulation of partitioned matrices is a basic tool in matrix
theory. The techniques for manipulating partitioned matrices resemble
those for ordinary numerical matrices. We begin by considering
a $2 \times 2$ matrix
\[ \begin{pmatrix} a & b \\ c & d \end{pmatrix}, \qquad a, b, c, d \in \mathbb{C}. \]

An application of an elementary row operation, say, adding the second
row multiplied by $-3$ to the first row, can be represented by the
matrix multiplication
\[ \begin{pmatrix} 1 & -3 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} a - 3c & b - 3d \\ c & d \end{pmatrix}. \]

F. Zhang, Matrix Theory: Basic Results and Techniques, Universitext


Elementary row or column operations for matrices play an important
role in elementary linear algebra. These operations (Section 1.2)
can be generalized to partitioned matrices as follows.

I. Interchange two block rows (columns).

II. Multiply a block row (column) from the left (right) by a
nonsingular matrix of appropriate size.

III. Multiply a block row (column) by a matrix from the left (right),
then add it to another row (column).

Write in matrices, say, for type III elementary row operations,
\[ \begin{pmatrix} A & B \\ C & D \end{pmatrix} \;\to\; \begin{pmatrix} A & B \\ C + XA & D + XB \end{pmatrix}, \]
where $A \in M_m$, $D \in M_n$, and $X$ is $n \times m$. Note that $A$ is multiplied
by $X$ from the left (when row operations are performed).

Generalized elementary matrices are those obtained by applying
a single elementary operation to the identity matrix. For instance,
\[ \begin{pmatrix} 0 & I_n \\ I_m & 0 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} I_m & 0 \\ X & I_n \end{pmatrix} \]
are generalized elementary matrices of type I and type III.


Theorem 2.1 Let G be the generalized elementary matrix obtained 
by performing an elementary row (column) operation on I. If that 
same elementary row (column) operation is performed on a block 
matrix A, then the resulting matrix is given by the product GA (AG). 


Proof. We show the case of $2 \times 2$ partitioned matrices because
we deal with this type of partitioned matrix most of the time. An
argument for the general case is similar.

Let $A$, $B$, $C$, and $D$ be matrices, where $A$ and $D$ are $m$- and
$n$-square, respectively. Suppose we apply a type III operation, say,
adding the first row times an $n \times m$ matrix $E$ from the left to the
second row, to the matrix
\[ \begin{pmatrix} A & B \\ C & D \end{pmatrix}. \qquad (2.1) \]

Then we have, by writing in equation,
\[ \begin{pmatrix} A & B \\ C + EA & D + EB \end{pmatrix} = \begin{pmatrix} I_m & 0 \\ E & I_n \end{pmatrix}\begin{pmatrix} A & B \\ C & D \end{pmatrix}. \]
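
The identity in the proof can be checked directly on numbers. In the sketch below the blocks are arbitrary 2 x 2 integer matrices, and the helper names `block`, `matmul`, and `add` are assumptions made for this example. The block row operation is carried out in two ways: blockwise, and as a single multiplication by the generalized elementary matrix.

```python
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def add(X, Y):
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

def block(A, B, C, D):
    """Assemble the partitioned matrix [[A, B], [C, D]] from its four blocks."""
    return [ra + rb for ra, rb in zip(A, B)] + [rc + rd for rc, rd in zip(C, D)]

I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]

# Arbitrary sample blocks (m = n = 2).
A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
C = [[2, 0], [0, 2]]
D = [[1, 1], [0, 1]]
E = [[1, -1], [2, 0]]

M = block(A, B, C, D)
G = block(I2, Z2, E, I2)          # type III generalized elementary matrix

# Row operation done blockwise: second block row += E * (first block row).
direct = block(A, B, add(C, matmul(E, A)), add(D, matmul(E, B)))

print(matmul(G, M) == direct)
```

Multiplying by the elementary matrix on the right instead would act on block columns, as Theorem 2.1 states.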

As an application, suppose that $A$ is invertible. By successively
applying suitable elementary row and column block operations, we
can change the matrix (2.1) so that the lower-left and upper-right
submatrices become 0. More precisely, we can make the lower-left
and upper-right submatrices 0 by subtracting the first row multiplied
by $CA^{-1}$ from the second row, and by subtracting the first
column multiplied by $A^{-1}B$ from the second column. In symbols,
\[ \begin{pmatrix} A & B \\ C & D \end{pmatrix} \to \begin{pmatrix} A & B \\ 0 & D - CA^{-1}B \end{pmatrix} \to \begin{pmatrix} A & 0 \\ 0 & D - CA^{-1}B \end{pmatrix}, \]
and in equation form,
\[ \begin{pmatrix} I_m & 0 \\ -CA^{-1} & I_n \end{pmatrix}\begin{pmatrix} A & B \\ C & D \end{pmatrix}\begin{pmatrix} I_m & -A^{-1}B \\ 0 & I_n \end{pmatrix} = \begin{pmatrix} A & 0 \\ 0 & D - CA^{-1}B \end{pmatrix}. \qquad (2.2) \]

Note that by taking determinants,
\[ \begin{vmatrix} A & B \\ C & D \end{vmatrix} = \det A \,\det(D - CA^{-1}B). \]

The method of manipulating block matrices by elementary operations
and the corresponding generalized elementary matrices as in
(2.2) is used repeatedly in this book.
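
The determinant identity obtained from (2.2) can be verified with exact arithmetic. The following sketch uses arbitrary 2 x 2 blocks with $A$ invertible; the helpers `det`, `inv2`, `sub`, and `block` are assumptions of this example, and the determinant is computed by cofactor expansion, which is only sensible for small matrices.

```python
from fractions import Fraction

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def sub(X, Y):
    return [[a - b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

def det(M):
    """Determinant by cofactor expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def inv2(M):
    """Inverse of an invertible 2 x 2 matrix."""
    (a, b), (c, d) = M
    dt = a * d - b * c
    return [[d / dt, -b / dt], [-c / dt, a / dt]]

def block(A, B, C, D):
    return [ra + rb for ra, rb in zip(A, B)] + [rc + rd for rc, rd in zip(C, D)]

F = Fraction
A = [[F(1), F(2)], [F(3), F(4)]]      # invertible: det A = -2
B = [[F(0), F(1)], [F(1), F(0)]]
C = [[F(2), F(0)], [F(0), F(2)]]
D = [[F(1), F(1)], [F(0), F(1)]]

M = block(A, B, C, D)
schur = sub(D, matmul(matmul(C, inv2(A)), B))   # D - C A^{-1} B

print(det(M), det(A) * det(schur))
```

The matrix $D - CA^{-1}B$ computed here is the block that remains in the lower-right corner of (2.2).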

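For practice, we now consider expressing the block matrix
\[ \begin{pmatrix} A & B \\ 0 & A^{-1} \end{pmatrix} \qquad (2.3) \]
as a product of block matrices of the forms
\[ \begin{pmatrix} I & X \\ 0 & I \end{pmatrix}, \qquad \begin{pmatrix} I & 0 \\ Y & I \end{pmatrix}. \]

In other words, we want to get a matrix in the above form by
performing type III operations on the block matrix in (2.3).
Add the first row of (2.3) times $A^{-1}$ to the second row to get
\[ \begin{pmatrix} A & B \\ I & A^{-1} + A^{-1}B \end{pmatrix}. \]

Add the second row multiplied by $I - A$ to the first row to get
\[ \begin{pmatrix} I & A^{-1} + A^{-1}B - I \\ I & A^{-1} + A^{-1}B \end{pmatrix}. \]

Subtract the first row from the second row to get
\[ \begin{pmatrix} I & A^{-1} + A^{-1}B - I \\ 0 & I \end{pmatrix}, \]
which is in the desired form. Putting these steps in identity, we have
\[ \begin{pmatrix} I & 0 \\ -I & I \end{pmatrix}\begin{pmatrix} I & I - A \\ 0 & I \end{pmatrix}\begin{pmatrix} I & 0 \\ A^{-1} & I \end{pmatrix}\begin{pmatrix} A & B \\ 0 & A^{-1} \end{pmatrix} = \begin{pmatrix} I & A^{-1} + A^{-1}B - I \\ 0 & I \end{pmatrix}. \]

Therefore,
\[ \begin{pmatrix} A & B \\ 0 & A^{-1} \end{pmatrix} = \begin{pmatrix} I & 0 \\ A^{-1} & I \end{pmatrix}^{-1}\begin{pmatrix} I & I - A \\ 0 & I \end{pmatrix}^{-1}\begin{pmatrix} I & 0 \\ -I & I \end{pmatrix}^{-1}\begin{pmatrix} I & A^{-1} + A^{-1}B - I \\ 0 & I \end{pmatrix} \]
\[ = \begin{pmatrix} I & 0 \\ -A^{-1} & I \end{pmatrix}\begin{pmatrix} I & A - I \\ 0 & I \end{pmatrix}\begin{pmatrix} I & 0 \\ I & I \end{pmatrix}\begin{pmatrix} I & A^{-1} + A^{-1}B - I \\ 0 & I \end{pmatrix} \]
is a product of type III generalized elementary matrices.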
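Problems


1. Let $E = E[i(c) \to j]$ denote the elementary matrix obtained from $I_n$
by adding row $i$ times $c$ to row $j$.

(a) Show that $E^* = E[j(\overline{c}) \to i]$.

(b) Show that $E^{-1} = E[i(-c) \to j]$.

(c) How is $E$ obtained via an elementary column operation?


2. Show that
\[ A(B, C) = (AB, AC) \quad \text{for } A \in M_{m \times n},\; B \in M_{n \times p},\; C \in M_{n \times q}; \]
\[ \begin{pmatrix} A \\ B \end{pmatrix}C = \begin{pmatrix} AC \\ BC \end{pmatrix} \quad \text{for } A \in M_{p \times n},\; B \in M_{q \times n},\; C \in M_{n \times m}. \]


3. Let $X$ be any $n \times m$ complex matrix. Show that
\[ \begin{pmatrix} I_m & 0 \\ X & I_n \end{pmatrix}^{-1} = \begin{pmatrix} I_m & 0 \\ -X & I_n \end{pmatrix}. \]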


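4. Show that for any $n$-square complex matrix $X$,
\[ \begin{pmatrix} X & I_n \\ I_n & 0 \end{pmatrix}^{-1} = \begin{pmatrix} 0 & I_n \\ I_n & -X \end{pmatrix}. \]

Does it follow that
\[ \begin{pmatrix} 0 & I_m \\ I_n & 0 \end{pmatrix}^{-1} = \begin{pmatrix} 0 & I_n \\ I_m & 0 \end{pmatrix}? \]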


5. Show that every $2 \times 2$ matrix of determinant 1 is the product of some
matrices of the following types, with $y \ne 0$:


(29) (oi) Go) Gr) Ga) 


6. Let $X$ and $Y$ be matrices with the same number of rows. Multiply


(a o)G a) om Geo) 0) 


7. Let $X$ and $Y$ be complex matrices of the same size. Verify that


Fee way ._ ft A 1 ¥\_ 
Ly Tay sy = hee 7 a oe p= 


€ gs = (TF) ernst (y.) Gn, 


l| 


AO 


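8. Let $a_1, a_2, \dots, a_n$ be complex numbers, $a = (-a_2, \dots, -a_n)$, and
\[ A = \begin{pmatrix} 0 & I_{n-1} \\ -a_1 & a \end{pmatrix}. \]

Find $\det A$. Show that $A$ is invertible if $a_1 \ne 0$ and that


9. Let $a_1, a_2, \dots, a_n$ be nonzero complex numbers. Find
\[ \begin{pmatrix} 0 & a_1 & 0 & \cdots & 0 \\ 0 & 0 & a_2 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & a_{n-1} \\ a_n & 0 & 0 & \cdots & 0 \end{pmatrix}^{-1}. \]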


10. Show that the block matrices $\begin{pmatrix} I_m & A \\ 0 & I_n \end{pmatrix}$ and $\begin{pmatrix} I_m & B \\ 0 & I_n \end{pmatrix}$ commute.


11. Show that a generalized elementary matrix $\begin{pmatrix} I & 0 \\ X & I \end{pmatrix}$ can be written as
the product of the same type of elementary matrices with only one
nonzero off-diagonal entry. [Hint: See how to get the $(i, j)$-entry $x_{ij}$ in
the matrix by a type iii elementary operation from Section 1.2.]


12. Let $A$ and $B$ be nonsingular matrices. Find $\begin{pmatrix} A & C \\ 0 & B \end{pmatrix}^{-1}$.


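13. Let $A$ and $B$ be $m$- and $n$-square matrices, respectively. Show that
\[ \begin{pmatrix} A & * \\ 0 & B \end{pmatrix}^{k} = \begin{pmatrix} A^k & * \\ 0 & B^k \end{pmatrix}, \]
where the $*$ are some matrices, and that if $A$ and $B$ are invertible,
\[ \begin{pmatrix} A & * \\ 0 & B \end{pmatrix}^{-1} = \begin{pmatrix} A^{-1} & * \\ 0 & B^{-1} \end{pmatrix}. \]

14. Let $A$ and $B$ be $n \times n$ complex matrices. Show that
\[ \begin{vmatrix} 0 & A \\ B & 0 \end{vmatrix} = (-1)^n \det A \,\det B \]
and that if $A$ and $B$ are invertible, then
\[ \begin{pmatrix} 0 & A \\ B & 0 \end{pmatrix}^{-1} = \begin{pmatrix} 0 & B^{-1} \\ A^{-1} & 0 \end{pmatrix}. \]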
15. Let $A$ and $B$ be $n \times n$ matrices. Apply elementary operations to
fe: ‘) to get a ), Deduce det(AB) = det A det B. 


16. Let $A$ and $B$ be $n \times n$ matrices. Apply elementary operations to


(3 :) to get Gn t) and é ie) Conclude that J— AB and 


I — BA have the same rank for any A, B € My. 
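17. Let $A, B \in M_n$. Show that $\begin{pmatrix} A & B \\ B & A \end{pmatrix}$ is similar to $\begin{pmatrix} A + B & 0 \\ 0 & A - B \end{pmatrix}$.

18. Let $A$ and $B$ be $n \times n$ matrices. Apply elementary operations to
$\begin{pmatrix} A & 0 \\ 0 & B \end{pmatrix}$ to get $\begin{pmatrix} A + B & B \\ B & B \end{pmatrix}$ and derive the rank inequality
\[ \mathrm{rank}(A + B) \le \mathrm{rank}(A) + \mathrm{rank}(B). \]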
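19. Let $A$ be a square complex matrix partitioned as
\[ A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}, \qquad A_{11} \in M_m, \; A_{22} \in M_n. \]

Show that for any $B \in M_m$
\[ \begin{vmatrix} BA_{11} & BA_{12} \\ A_{21} & A_{22} \end{vmatrix} = \det B \,\det A \]
and for any $n \times m$ matrix $C$
\[ \begin{vmatrix} A_{11} & A_{12} \\ A_{21} + CA_{11} & A_{22} + CA_{12} \end{vmatrix} = \det A. \]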

20. Let $A$ be a square complex matrix. Show that
\[ \begin{vmatrix} I & A^* \\ A & I \end{vmatrix} = 1 - \sum M_1^*M_1 + \sum M_2^*M_2 - \sum M_3^*M_3 + \cdots, \]
where each $M_k$ is a minor of order $k = 1, 2, \dots$. [Hint: Reduce the
left-hand side to $\det(I - A^*A)$ and use Problem 19 of Section 1.3.]
2.2 The Determinant and Inverse of Partitioned Matrices


Let M be a square complex matrix partitioned as

$$M = \begin{pmatrix} A & B \\ C & D \end{pmatrix},$$

where A and D are m- and n-square matrices, respectively. We discuss the determinants and inverses of matrices in this form. The results are fundamental and used almost everywhere in matrix theory, such as matrix computation and matrix inequalities. The methods of continuity and finding inverses deserve special attention.


Theorem 2.2 Let M be a square matrix partitioned as above. Then

$$\det M = \det A \, \det(D - CA^{-1}B), \quad \text{if } A \text{ is invertible},$$

and

$$\det M = \det(AD - CB), \quad \text{if } AC = CA.$$


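Proof. When $A^{-1}$ exists, it is easy to verify (see also (2.2)) that

$$\begin{pmatrix} I_m & 0 \\ -CA^{-1} & I_n \end{pmatrix} \begin{pmatrix} A & B \\ C & D \end{pmatrix} = \begin{pmatrix} A & B \\ 0 & D - CA^{-1}B \end{pmatrix}.$$

By taking determinants for both sides, we have

$$\det M = \begin{vmatrix} A & B \\ 0 & D - CA^{-1}B \end{vmatrix} = \det A \, \det(D - CA^{-1}B).$$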


For the second part, if A and C commute, then A, B, C, and D 
are of the same size. We show the identity by the so-called continuity 
argument method. 

First consider the case where A is invertible. Following the above 
argument and using the fact that 


det(XY) = det X det Y 
for any two square matrices X and Y of the same size, we have

$$\begin{aligned} \det M &= \det A \, \det(D - CA^{-1}B) \\ &= \det(AD - ACA^{-1}B) \\ &= \det(AD - CAA^{-1}B) \\ &= \det(AD - CB). \end{aligned}$$

Now assume that A is singular. Since det(A + εI) as a polynomial in ε has a finite number of zeros, we may choose δ > 0 such that

$$\det(A + \epsilon I) \neq 0 \quad \text{whenever } 0 < \epsilon < \delta;$$

that is, A + εI is invertible for all ε ∈ (0, δ). Denote

$$M_\epsilon = \begin{pmatrix} A + \epsilon I & B \\ C & D \end{pmatrix}.$$

Noticing further that A + εI and C commute, we have

$$\det M_\epsilon = \det((A + \epsilon I)D - CB) \quad \text{whenever } 0 < \epsilon < \delta.$$

Observe that both sides of the above equation are continuous functions of ε. Letting ε → 0⁺ gives

$$\det M = \det(AD - CB). \qquad \blacksquare$$
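The second identity of Theorem 2.2 can be checked numerically on a case where A is singular, so that the first formula is unavailable. The following is a minimal sketch in exact rational arithmetic; the particular matrices and the helper names (`det`, `matmul`, `sub`, `block`, `shift`) are illustrative assumptions, not taken from the text.

```python
from fractions import Fraction

def det(m):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in m]
    n, sign, prod = len(a), 1, Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)          # no pivot in this column: singular
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            sign = -sign                # a row swap flips the sign
        prod *= a[col][col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            a[r] = [x - f * y for x, y in zip(a[r], a[col])]
    return sign * prod

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def sub(x, y):
    return [[p - q for p, q in zip(rx, ry)] for rx, ry in zip(x, y)]

def block(a, b, c, d):
    """Assemble the partitioned matrix (A B; C D)."""
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]

def shift(a, eps):
    """Return A + eps*I."""
    return [[a[i][j] + (eps if i == j else 0) for j in range(len(a))]
            for i in range(len(a))]

# Illustrative data: A is singular and commutes with C.
A = [[1, 1], [1, 1]]
C = [[2, 3], [3, 2]]
B = [[1, 2], [3, 4]]
D = [[0, 1], [5, 7]]
M = block(A, B, C, D)

lhs = det(M)
rhs = det(sub(matmul(A, D), matmul(C, B)))

# The perturbed matrices M_eps from the continuity argument satisfy the
# same identity for each small eps > 0.
perturbed_ok = all(
    det(block(shift(A, e), B, C, D))
    == det(sub(matmul(shift(A, e), D), matmul(C, B)))
    for e in (Fraction(1, 10), Fraction(1, 100))
)
print(lhs, rhs, perturbed_ok)
```

Exact fractions are used so that equality can be tested without rounding concerns.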


Note that the identity need not be true if AC ≠ CA.
We now turn our attention to the inverses of partitioned matrices. 


Theorem 2.3 Suppose that the partitioned matrix

$$M = \begin{pmatrix} A & B \\ C & D \end{pmatrix}$$

is invertible and that the inverse is conformally partitioned as

$$M^{-1} = \begin{pmatrix} X & Y \\ U & V \end{pmatrix},$$

where A, D, X, and V are square matrices. Then

$$\det A = \det V \, \det M. \qquad (2.4)$$




Proof. The identity (2.4) follows immediately by taking the determinants of both sides of the matrix identity

$$\begin{pmatrix} A & B \\ C & D \end{pmatrix} \begin{pmatrix} I & Y \\ 0 & V \end{pmatrix} = \begin{pmatrix} A & 0 \\ C & I \end{pmatrix}. \qquad \blacksquare$$

Note that the identity matrices I in the proof may have different sizes. By Theorem 2.3, A is singular if and only if V is singular.


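Theorem 2.4 Let M and $M^{-1}$ be as defined in Theorem 2.3. If A is a nonsingular principal submatrix of M, then

$$\begin{aligned} X &= A^{-1} + A^{-1}B(D - CA^{-1}B)^{-1}CA^{-1}, \\ Y &= -A^{-1}B(D - CA^{-1}B)^{-1}, \\ U &= -(D - CA^{-1}B)^{-1}CA^{-1}, \\ V &= (D - CA^{-1}B)^{-1}. \end{aligned}$$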

Proof. As we know from elementary linear algebra (Theorem 1.2), every invertible matrix can be written as a product of elementary matrices; so can $M^{-1}$. Furthermore, since

$$M^{-1}(M, I) = (I, M^{-1}),$$

this says that we can obtain the inverse of M by performing row operations on (M, I) to get $(I, M^{-1})$ (Problem 23, Section 1.2). We now apply row operations to the augmented block matrix

$$\left(\begin{array}{cc|cc} A & B & I & 0 \\ C & D & 0 & I \end{array}\right).$$

Multiply row 1 by $A^{-1}$ (from the left) to get

$$\left(\begin{array}{cc|cc} I & A^{-1}B & A^{-1} & 0 \\ C & D & 0 & I \end{array}\right).$$

Subtract row 1 multiplied by C from row 2 to get

$$\left(\begin{array}{cc|cc} I & A^{-1}B & A^{-1} & 0 \\ 0 & D - CA^{-1}B & -CA^{-1} & I \end{array}\right).$$

Multiply row 2 by $(D - CA^{-1}B)^{-1}$ (which exists; why?) to get

$$\left(\begin{array}{cc|cc} I & A^{-1}B & A^{-1} & 0 \\ 0 & I & -(D - CA^{-1}B)^{-1}CA^{-1} & (D - CA^{-1}B)^{-1} \end{array}\right).$$

By subtracting row 2 times $A^{-1}B$ from row 1, we get the inverse of the partitioned matrix M in the form

$$\begin{pmatrix} A^{-1} + A^{-1}B(D - CA^{-1}B)^{-1}CA^{-1} & -A^{-1}B(D - CA^{-1}B)^{-1} \\ -(D - CA^{-1}B)^{-1}CA^{-1} & (D - CA^{-1}B)^{-1} \end{pmatrix}.$$

Comparing to the inverse $M^{-1}$ in the block form, we have X, Y, U, and V with the desired expressions in terms of A, B, C, and D. ∎
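As a small sanity check of Theorem 2.4, consider the simplest possible partition, with 1 x 1 blocks. The numbers below are an illustrative choice, not taken from the text.

```latex
% Illustrative example with scalar blocks: A = 2, B = 1, C = 4, D = 3.
M = \begin{pmatrix} 2 & 1 \\ 4 & 3 \end{pmatrix}, \qquad
D - CA^{-1}B = 3 - 4 \cdot \tfrac12 \cdot 1 = 1.
% The four blocks of the inverse given by Theorem 2.4:
X = \tfrac12 + \tfrac12 \cdot 1 \cdot 1 \cdot 4 \cdot \tfrac12 = \tfrac32, \quad
Y = -\tfrac12 \cdot 1 \cdot 1 = -\tfrac12, \quad
U = -1 \cdot 4 \cdot \tfrac12 = -2, \quad
V = 1.
% Direct check, and the determinant identity (2.4):
\begin{pmatrix} 2 & 1 \\ 4 & 3 \end{pmatrix}
\begin{pmatrix} \tfrac32 & -\tfrac12 \\ -2 & 1 \end{pmatrix}
= \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad
\det A = 2 = 1 \cdot 2 = \det V \, \det M.
```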


A similar discussion for a nonsingular D implies

$$X = (A - BD^{-1}C)^{-1}.$$

It follows that

$$(A - BD^{-1}C)^{-1} = A^{-1} + A^{-1}B(D - CA^{-1}B)^{-1}CA^{-1}. \qquad (2.5)$$


Below is a direct proof for (2.5). More similar identities are derived by means of partitioned matrices in Section 6.4 of Chapter 6.


Theorem 2.5 Let A ∈ M_m and B ∈ M_n be nonsingular matrices and let C and D be m x n and n x m matrices, respectively. If the matrix A + CBD is nonsingular, then

$$(A + CBD)^{-1} = A^{-1} - A^{-1}C(B^{-1} + DA^{-1}C)^{-1}DA^{-1}. \qquad (2.6)$$


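Proof. Note that $B^{-1} + DA^{-1}C$ is nonsingular, since (Problem 5)

$$\begin{aligned} \det(B^{-1} + DA^{-1}C) &= \det B^{-1} \, \det(I_n + BDA^{-1}C) \\ &= \det B^{-1} \, \det(I_m + A^{-1}CBD) \\ &= \det B^{-1} \, \det A^{-1} \, \det(A + CBD) \neq 0. \end{aligned}$$

We now prove (2.6) by a direct verification:

$$\begin{aligned} &(A + CBD)\bigl(A^{-1} - A^{-1}C(B^{-1} + DA^{-1}C)^{-1}DA^{-1}\bigr) \\ &\quad = I_m - C(B^{-1} + DA^{-1}C)^{-1}DA^{-1} + CBDA^{-1} - CBDA^{-1}C(B^{-1} + DA^{-1}C)^{-1}DA^{-1} \\ &\quad = I_m - C\bigl((B^{-1} + DA^{-1}C)^{-1} - B + BDA^{-1}C(B^{-1} + DA^{-1}C)^{-1}\bigr)DA^{-1} \\ &\quad = I_m - C\bigl((I_n + BDA^{-1}C)(B^{-1} + DA^{-1}C)^{-1} - B\bigr)DA^{-1} \\ &\quad = I_m - C\bigl(B(B^{-1} + DA^{-1}C)(B^{-1} + DA^{-1}C)^{-1} - B\bigr)DA^{-1} \\ &\quad = I_m - C(B - B)DA^{-1} \\ &\quad = I_m. \qquad \blacksquare \end{aligned}$$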


A great number of matrix identities involving inverses can be derived from (2.6). The following two are immediate when the involved inverses exist:

$$(A + B)^{-1} = A^{-1} - A^{-1}(B^{-1} + A^{-1})^{-1}A^{-1}$$

and

$$(A + UV^*)^{-1} = A^{-1} - A^{-1}U(I + V^*A^{-1}U)^{-1}V^*A^{-1}.$$
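To see the last identity at work, one can take A = I and let U = u and V = v be single column vectors, in which case it reads $(I + uv^*)^{-1} = I - (1 + v^*u)^{-1}uv^*$. The vectors below are an illustrative choice.

```latex
% Illustrative data: u = (1, 1)^T, v = (1, 0)^T, so v^* u = 1.
uv^* = \begin{pmatrix} 1 & 0 \\ 1 & 0 \end{pmatrix}, \qquad
I + uv^* = \begin{pmatrix} 2 & 0 \\ 1 & 1 \end{pmatrix}.
% Right-hand side of the identity:
I - \frac{1}{1 + v^*u}\, uv^*
= \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}
- \frac12 \begin{pmatrix} 1 & 0 \\ 1 & 0 \end{pmatrix}
= \begin{pmatrix} \tfrac12 & 0 \\ -\tfrac12 & 1 \end{pmatrix}.
% Direct check:
\begin{pmatrix} 2 & 0 \\ 1 & 1 \end{pmatrix}
\begin{pmatrix} \tfrac12 & 0 \\ -\tfrac12 & 1 \end{pmatrix}
= \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.
```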


Problems 


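1. Let A be an n x n nonsingular matrix, a ∈ C, $\alpha^T, \beta \in \mathbb{C}^n$. Prove

$$(\det A)^{-1} \begin{vmatrix} a & \alpha \\ \beta & A \end{vmatrix} = a - \alpha A^{-1}\beta.$$


2. Refer to Theorem 2.2 and assume AC = CA. Does it follow that

$$\begin{vmatrix} A & B \\ C & D \end{vmatrix} = \det(AD - BC)?$$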


3. For matrices A, B, C of appropriate sizes, evaluate the determinants 
A In 0 A 0 A 
Im O |}? A |? BCr| 
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4. 


10. 


Let A, B, and C be n-square complex matrices. Show that 


fee 


—. | = det(C — BA). 


. Let A and B be m x n and n x m matrices, respectively. Show that 


A In 


In B 
B dp, 


-| i, A 


and conclude that 
det (Im — AB) = det(I, — BA). 
Is it true that 


rank (I,, — AB) = rank ([,, — BA)? 


. Can any two of the following expressions be identical for general 


complex square matrices A, B, C, D of the same size? 


det(AD — CB), det(AD— BC), det(DA—CB), det(DA-— BC), 


A B 
C DT 


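7. If A is an invertible matrix, show that

$$\operatorname{rank} \begin{pmatrix} A & B \\ C & D \end{pmatrix} = \operatorname{rank}(A) + \operatorname{rank}(D - CA^{-1}B).$$

In particular,

$$\operatorname{rank} \begin{pmatrix} I_n & I_n \\ X & Y \end{pmatrix} = n + \operatorname{rank}(X - Y).$$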


. Does it follow from the identity (2.4) that any principal submatrix 


(A) of a singular matrix (IM) is singular? 


. Find the determinant and the inverse of the 2m x 2m block matrix 


re ( ae pris Ne al Be BO. 


If U is a unitary matrix partitioned as U = (: 2) , where u € C, 


show that |u| = | det U;|. What if U is real orthogonal? 


A8 


11. 


12. 


13; 


14. 


15. 


16. 
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Find the inverses, if they exist, for the matrices 


(70) (yz): 


Let A be an n-square nonsingular matrix. Write

$$A = B + iC, \qquad A^{-1} = F + iG,$$

where B, C, F, G are real matrices, and set

$$D = \begin{pmatrix} B & -C \\ C & B \end{pmatrix}.$$

Show that D is nonsingular and that the inverse of D is

$$\begin{pmatrix} F & -G \\ G & F \end{pmatrix}.$$

In addition, D is normal if A is normal, and orthogonal if A is unitary.


Let A and C be m- and n-square invertible matrices, respectively. Show that for any m x n matrix B and n x m matrix D,

$$\det(A + BCD) = \det A \, \det C \, \det(C^{-1} + DA^{-1}B).$$

What can be deduced from this identity if $A = I_m$ and $C = I_n$?


Let A and B be real square matrices of the same size. Show that 


are 


a | = | det(A +2B)/*. 


Let A and B be complex square matrices of the same size. Show that

$$\begin{vmatrix} A & B \\ B & A \end{vmatrix} = \det(A + B) \, \det(A - B).$$

Also show that the eigenvalues of the 2 x 2 block matrix on the left-hand side consist of those of A + B and A - B.


Let A and B be n-square matrices. For any integer k (positive or 
negative if A is invertible), find the (1,2) block of the matrix 


(7). 
17. Let B and C be complex matrices with the same number of rows, and let A = (B, C). Show that

$$\begin{vmatrix} I & B^* \\ B & AA^* \end{vmatrix} = \det(CC^*)$$

and that if C*B = 0 then

$$\det(A^*A) = \det(B^*B) \, \det(C^*C).$$


18. Let A ∈ M_n. Show that there exists a diagonal matrix D with diagonal entries ±1 such that det(A + D) ≠ 0. [Hint: Show by induction.]


19. If I + A is nonsingular, show that $(I + A)^{-1}$ and I - A commute and

$$(I + A)^{-1} + (I + A^{-1})^{-1} = I.$$


20. Show that for any m x n complex matrix A

$$(I + A^*A)^{-1}A^*A = A^*A(I + A^*A)^{-1} = I - (I + A^*A)^{-1}.$$


21. Let A and B be m x n and n x m matrices, respectively. If $I_n + BA$ is nonsingular, show that $I_m + AB$ is nonsingular and that

$$(I_n + BA)^{-1}B = B(I_m + AB)^{-1}.$$


22. Let A and B be m x n and n x m matrices, respectively. If the involved inverses exist, show that

$$(I - AB)^{-1} = I + A(I - BA)^{-1}B.$$

Conclude that if I - AB is invertible, then so is I - BA. In particular,

$$(I + AA^*)^{-1} = I - A(I + A^*A)^{-1}A^*.$$


23. Let A ∈ M_n and α, β ∈ C. If the involved inverses exist, show that

$$(\alpha I - A)^{-1} - (\beta I - A)^{-1} = (\beta - \alpha)(\alpha I - A)^{-1}(\beta I - A)^{-1}.$$


24. Show that for any $x, y \in \mathbb{C}^n$

$$\operatorname{adj}(I - xy^*) = xy^* + (1 - y^*x)I.$$


25. Let $u, v \in \mathbb{C}^n$ with v*u ≠ 0. Write $v^*u = p^{-1} + q^{-1}$. Show that

$$(I - puv^*)^{-1} = I - quv^*.$$
26. Let u and v be column vectors in $\mathbb{C}^n$ such that $v^*A^{-1}u$ is not equal to -1. Show that A + uv* is invertible and, with $\delta = 1 + v^*A^{-1}u$,

$$(A + uv^*)^{-1} = A^{-1} - \delta^{-1}A^{-1}uv^*A^{-1}.$$

27. Let $M = \begin{pmatrix} A & B \\ C & D \end{pmatrix}$. Assume that the inverses involved exist, and denote

$$S = D - CA^{-1}B, \qquad T = A - BD^{-1}C.$$

Show that each of the following expressions is equal to $M^{-1}$.

(a) $\begin{pmatrix} A^{-1} + A^{-1}BS^{-1}CA^{-1} & -A^{-1}BS^{-1} \\ -S^{-1}CA^{-1} & S^{-1} \end{pmatrix}$.

-T1BD"1 
—D ee L Do+D" OT BD 


—A'BS"1 ) 


© ( 
© (_» fo roe 
o ( 
© ( 


(B- Te ae 


SA")(45 2 )(de 3) 


(£) ee -_ ) fe ( ie ) seart,-2) 


28. Deduce the following inverse identities from the previous problem. 


(C - ae LAY 7 
i 


(A—BD-'c)"' A++A‘'B(D-—CA™"'B)"'CcA 
= -C'D(B- ACD)" 

= -(C-—DB'A)'DB" 

= C0“ DD=-CA “By CA" 

= A ‘B(D-—CA"B) DB". 
2.3 The Rank of Product and Sum


This section is concerned with the ranks of the product AB and the sum A + B in terms of the ranks of matrices A and B.

Matrix rank is one of the most important concepts. In the previous chapter we defined the rank of a matrix A to be the nonnegative number r in the matrix $\begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix}$, which is obtained through elementary operations on A. The rank of matrix A, denoted by rank (A), is uniquely determined by A. rank (A) = 0 if and only if A = 0.

The rank of a matrix can be defined in many different but equivalent ways (see Problem 24 of Section 1.2). For instance, it can be defined by row rank or column rank. The row (column) rank of a matrix A is the dimension of the vector space spanned by the rows (columns) of A. The row rank and column rank of a matrix are equal. Here is why: let the row rank of an m x n matrix A be r and the column rank be c. We show r = c. Suppose that columns $C_1, C_2, \ldots, C_c$ of A are linearly independent and span the column space of A. For the jth column $A_j$ of A, j = 1, 2, ..., n, we can write

$$A_j = d_{1j}C_1 + d_{2j}C_2 + \cdots + d_{cj}C_c = Cd_j,$$

where $C = (C_1, C_2, \ldots, C_c)$ and $d_j = (d_{1j}, d_{2j}, \ldots, d_{cj})^T$.

Let $D = (d_{ij}) = (d_1, d_2, \ldots, d_n)$. Then A = CD and D is c x n. It follows that every row of A is a linear combination of the rows of D. Thus, r ≤ c. A similar argument on $A^T$ shows c ≤ r.

We can also see that row rank equals column rank through the relation that the dimension of the column space, i.e., Im A, is the rank of A. Recall that the kernel, or the null space, and the image of an m x n matrix A, viewed as a linear transformation, are, respectively,

$$\operatorname{Ker} A = \{x \in \mathbb{C}^n : Ax = 0\}, \qquad \operatorname{Im} A = \{Ax : x \in \mathbb{C}^n\}.$$

Denote the row and column ranks of matrix X by rr(X) and cr(X), respectively. Then cr(X) = dim Im X. Since Ax = 0 ⇔ A*Ax = 0,

$$\operatorname{cr}(A) = \operatorname{cr}(A^*A) = \dim \operatorname{Im}(A^*A) \le \dim \operatorname{Im}(A^*) = \operatorname{cr}(A^*).$$

Hence cr(A*) ≤ cr((A*)*) = cr(A). So cr(A) = cr(A*). However, $\operatorname{cr}(A) = \operatorname{rr}(A^T)$; we have $\operatorname{cr}(A) = \operatorname{cr}(A^*) = \operatorname{rr}(\bar{A}) = \operatorname{rr}(A)$ (over C).


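Theorem 2.6 (Sylvester) Let A and B be complex matrices of sizes m x n and n x p, respectively. Then

$$\operatorname{rank}(AB) = \operatorname{rank}(B) - \dim(\operatorname{Im} B \cap \operatorname{Ker} A). \qquad (2.7)$$

Consequently,

$$\operatorname{rank}(A) + \operatorname{rank}(B) - n \le \operatorname{rank}(AB) \le \min\{\operatorname{rank}(A), \operatorname{rank}(B)\}.$$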


Proof. Recall from Theorem 1.5 that if $\mathcal{A}$ is a linear transformation on an n-dimensional vector space, then

$$\dim \operatorname{Im}(\mathcal{A}) + \dim \operatorname{Ker}(\mathcal{A}) = n.$$

Viewing A as a linear transformation on $\mathbb{C}^n$, we have (Problem 1)

$$\operatorname{rank}(A) = \dim \operatorname{Im} A.$$

For the rank of AB, we think of A as a linear transformation on the vector space Im B. Then its image is Im(AB) and its null space is Im B ∩ Ker A. We thus have

$$\dim \operatorname{Im}(AB) + \dim(\operatorname{Im} B \cap \operatorname{Ker} A) = \dim \operatorname{Im} B.$$

The identity (2.7) then follows.
For the inequalities, the second one is immediate from (2.7), and the first one is due to the fact that

$$\dim \operatorname{Im} A + \dim(\operatorname{Im} B \cap \operatorname{Ker} A) \le n. \qquad \blacksquare$$
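A small instance makes the identity (2.7) concrete. The matrices below are an illustrative choice in which the lower bound of Theorem 2.6 is attained.

```latex
% Illustrative data with n = 2:
A = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \qquad
B = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \qquad
AB = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}.
% Both Im B and Ker A are spanned by e_2 = (0, 1)^T, so
\dim(\operatorname{Im} B \cap \operatorname{Ker} A) = 1, \qquad
\operatorname{rank}(AB) = \operatorname{rank}(B) - 1 = 0.
% Sylvester's bounds for this pair:
\operatorname{rank}(A) + \operatorname{rank}(B) - n = 1 + 1 - 2 = 0
\le \operatorname{rank}(AB) = 0 \le \min\{1, 1\} = 1.
```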
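For the product of three matrices, we have

$$\operatorname{rank}(ABC) \ge \operatorname{rank}(AB) + \operatorname{rank}(BC) - \operatorname{rank}(B). \qquad (2.8)$$

A pure matrix proof of (2.8) goes as follows. Note that

$$\operatorname{rank} \begin{pmatrix} 0 & X \\ Y & Z \end{pmatrix} \ge \operatorname{rank}(X) + \operatorname{rank}(Y)$$

for any matrix Z of appropriate size, and that equality holds if Z = 0. The inequality (2.8) then follows from the matrix identity

$$\begin{pmatrix} I & -A \\ 0 & I \end{pmatrix} \begin{pmatrix} 0 & AB \\ BC & B \end{pmatrix} \begin{pmatrix} I & 0 \\ -C & I \end{pmatrix} = \begin{pmatrix} -ABC & 0 \\ 0 & B \end{pmatrix}.$$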
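Theorem 2.7 Let A and B be m x n matrices and denote by C and D, respectively, the partitioned matrices

$$C = (I_m, I_m), \qquad D = \begin{pmatrix} A \\ B \end{pmatrix}.$$

Then

$$\operatorname{rank}(A + B) = \operatorname{rank}(A) + \operatorname{rank}(B) - \dim(\operatorname{Ker} C \cap \operatorname{Im} D) - \dim(\operatorname{Im} A^* \cap \operatorname{Im} B^*). \qquad (2.9)$$

In particular,

$$\operatorname{rank}(A + B) \le \operatorname{rank}(A) + \operatorname{rank}(B).$$


Proof. Write

$$A + B = (I_m, I_m) \begin{pmatrix} A \\ B \end{pmatrix} = CD.$$

Utilizing the previous theorem, we have

$$\operatorname{rank}(A + B) = \operatorname{rank}(D) - \dim(\operatorname{Im} D \cap \operatorname{Ker} C). \qquad (2.10)$$

However,

$$\begin{aligned} \operatorname{rank}(D) &= \operatorname{rank}(D^*) = \operatorname{rank}(A^*, B^*) \\ &= \dim \operatorname{Im}(A^*, B^*) \\ &= \dim(\operatorname{Im} A^* + \operatorname{Im} B^*) \quad \text{(by Problem 9)} \\ &= \dim \operatorname{Im} A^* + \dim \operatorname{Im} B^* - \dim(\operatorname{Im} A^* \cap \operatorname{Im} B^*) \\ &= \operatorname{rank}(A^*) + \operatorname{rank}(B^*) - \dim(\operatorname{Im} A^* \cap \operatorname{Im} B^*) \\ &= \operatorname{rank}(A) + \operatorname{rank}(B) - \dim(\operatorname{Im} A^* \cap \operatorname{Im} B^*). \end{aligned}$$

Substituting this into (2.10) reveals (2.9). ∎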
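The rank inequalities of Theorems 2.6 and 2.7 are easy to test on small matrices. The sketch below computes rank by exact Gaussian elimination; the function names and the sample matrices are illustrative assumptions rather than anything prescribed by the text.

```python
from fractions import Fraction

def rank(m):
    """Rank via row reduction in exact rational arithmetic."""
    a = [[Fraction(x) for x in row] for row in m]
    rows, cols, r = len(a), len(a[0]), 0
    for c in range(cols):
        if r == rows:
            break
        pivot = next((i for i in range(r, rows) if a[i][c] != 0), None)
        if pivot is None:
            continue                     # no pivot in this column
        a[r], a[pivot] = a[pivot], a[r]
        for i in range(r + 1, rows):
            f = a[i][c] / a[r][c]
            a[i] = [x - f * y for x, y in zip(a[i], a[r])]
        r += 1
    return r

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def add(x, y):
    return [[p + q for p, q in zip(rx, ry)] for rx, ry in zip(x, y)]

# Illustrative 3 x 3 data (n = 3), each matrix of rank 2.
A = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]
B = [[0, 0, 0], [0, 1, 0], [0, 0, 1]]
n = 3

rA, rB = rank(A), rank(B)
rAB = rank(matmul(A, B))
rSum = rank(add(A, B))

# Sylvester: rank A + rank B - n <= rank AB <= min(rank A, rank B)
sylvester_ok = rA + rB - n <= rAB <= min(rA, rB)
# Theorem 2.7: rank(A + B) <= rank A + rank B
sum_ok = rSum <= rA + rB
print(rA, rB, rAB, rSum, sylvester_ok, sum_ok)
```

In this example Im B is spanned by e_2, e_3 and Ker A by e_3, so the intersection in (2.7) has dimension 1 and rank(AB) = 2 - 1 = 1, which is also the Sylvester lower bound.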
Problems

1. Show that rank (A) = dim Im A for any complex matrix A.


Let A,B ¢ My. If rank(A) < 3 and rank(B) < 4, show that 
det(A + AB) = 0 for some complex number X. 


3. If B ∈ M_n is invertible, show that rank (AB) = rank (A) for every m x n matrix A. Is the converse true?


4. Is it true that the sum of two singular matrices is singular? How about the product?


5. Let A be an m x n matrix. Show that if rank (A) = m, then there is an n x m matrix B such that $AB = I_m$, and that if rank (A) = n, then there is an n x m matrix B such that $BA = I_n$.


6. Let A be an m x n matrix. Show that for any s x m matrix X with columns linearly independent and any n x t matrix Y with rows linearly independent,

rank (A) = rank (XA) = rank (AY) = rank (XAY).


7. For matrices A and B, show that if rank (AB) = rank (B), then

ABX = ABY ⇒ BX = BY,

and if rank (AB) = rank (A), then

XAB = YAB ⇒ XA = YA.


8. Let A be an m x n matrix over a field F, rank (A) = r. Show that for any positive integer k, r ≤ k ≤ n, there exists an n x n matrix B over F such that AB = 0 and rank (A) + rank (B) = k.


9. For any matrices A and B of the same size, show that

Im(A, B) = Im A + Im B.


10. Let A be an m x n matrix. Show that for any n x m matrix B,

dim Im A + dim Ker A = dim Im(BA) + dim Ker(BA).


11. Let A and B be m x n and n x m matrices, respectively. Show that

det(AB) = 0 if m > n.
12. For matrices A and B of the same size, show that

|rank (A) - rank (B)| ≤ rank (A + B).


13. Let A and B be n-square complex matrices. Show that

rank (AB - I) ≤ rank (A - I) + rank (B - I).


14. Let A ∈ M_n. If $A^2 = I$, show that

rank (A + I) + rank (A - I) = n.


15. Show that if A ∈ M_n and $A^2 = A$, then rank (A) + rank ($I_n$ - A) = n.


16. Let A, B ∈ M_n. Show that
(a) rank (A - ABA) = rank (A) + rank ($I_n$ - BA) - n.
(b) If A + B = $I_n$ and rank (A) + rank (B) = n, then $A^2 = A$, $B^2 = B$, and AB = 0 = BA.


17. If A is an n-square matrix with rank r, show that there exists an n-square matrix B of rank n - r such that AB = 0.


18. Let A and B be n x n matrices over a field F having null spaces $W_1$ and $W_2$, respectively. (i) If AB = 0, show that dim $W_1$ + dim $W_2$ ≥ n. (ii) Show that $W_1 = W_2$ if and only if A = PB and B = QA for some n x n matrices P and Q.


19. Let A be an m x n matrix with rank n. If m > n, show that there is a matrix B of size (m - n) x m and a matrix C of size m x (m - n), both of rank m - n, such that BA = 0 and (A, C) is nonsingular.


20. Let $\mathcal{A}$ be a linear transformation on a finite-dimensional vector space. Show that

$$\operatorname{Ker}(\mathcal{A}) \subseteq \operatorname{Ker}(\mathcal{A}^2) \subseteq \operatorname{Ker}(\mathcal{A}^3) \subseteq \cdots$$

and that

$$\operatorname{Im}(\mathcal{A}) \supseteq \operatorname{Im}(\mathcal{A}^2) \supseteq \operatorname{Im}(\mathcal{A}^3) \supseteq \cdots.$$

Further show that there are finite proper inclusions in each chain.


21. If A is an m x n complex matrix, Im(A) is in fact the space spanned by the column vectors of A, called the column space of A and denoted by C(A). Similarly, the row vectors of A span the row space, symbolized by R(A). Let A and B be two matrices. Show that

C(A) ⊆ C(B) ⇔ A = BC,

for some matrix C, and

R(A) ⊆ R(B) ⇔ A = RB,

for some matrix R. [Note: C(A) = Im A and R(A) = Im $A^T$.]


22. Let A and B be m x n matrices. Show that

C(A + B) ⊆ C(A) + C(B)

and that the following statements are equivalent.

(a) C(A) ⊆ C(A + B).

(b) C(B) ⊆ C(A + B).

(c) C(A + B) = C(A) + C(B).


23. Prove or disprove, for any n-square matrices A and B, that

$$\operatorname{rank} \begin{pmatrix} A \\ B \end{pmatrix} = \operatorname{rank}(A, B).$$


24. Let A be an m x n matrix and B be a p x n matrix. Show that

$$\operatorname{Ker} A \cap \operatorname{Ker} B = \operatorname{Ker} \begin{pmatrix} A \\ B \end{pmatrix}.$$


25. Let A and B be matrices of the same size. Show the rank inequalities

$$\operatorname{rank}(A + B) \le \operatorname{rank} \begin{pmatrix} A \\ B \end{pmatrix} \le \operatorname{rank}(A) + \operatorname{rank}(B)$$

and

$$\operatorname{rank}(A + B) \le \operatorname{rank}(A, B) \le \operatorname{rank}(A) + \operatorname{rank}(B)$$

by writing

$$A + B = (A, B) \begin{pmatrix} I \\ I \end{pmatrix} = (I, I) \begin{pmatrix} A \\ B \end{pmatrix}.$$

Additionally, show that

$$\operatorname{rank} \begin{pmatrix} A & B \\ C & D \end{pmatrix} \le \operatorname{rank}(A) + \operatorname{rank}(B) + \operatorname{rank}(C) + \operatorname{rank}(D).$$


26. Let A, B, and C be complex matrices of the same size. Show that

rank (A, B, C) ≤ rank (A, B) + rank (B, C) - rank (B).
2.4 The Eigenvalues of AB and BA


For square matrices A and B of the same size, the product matrices AB and BA need not be equal, or even similar. For instance, if

$$A = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \qquad B = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix},$$

then

$$AB = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} \quad \text{but} \quad BA = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}.$$

Note that in this example both AB (= 0) and BA (≠ 0) have only the repeated eigenvalue 0 (twice, referred to as the multiplicity of 0). Is this a coincidence, or can we construct an example such that AB has only zero eigenvalues but BA has some nonzero eigenvalues?


The following theorem gives a negative answer to the question. This is a very important result in matrix theory.


Theorem 2.8 Let A and B be m x n and n x m complex matrices, respectively. Then AB and BA have the same nonzero eigenvalues, counting multiplicity. If m = n, AB and BA have the same eigenvalues.


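Proof 1. Use determinants. Notice that

$$\begin{pmatrix} I_m & -A \\ 0 & \lambda I_n \end{pmatrix} \begin{pmatrix} \lambda I_m & A \\ B & I_n \end{pmatrix} = \begin{pmatrix} \lambda I_m - AB & 0 \\ \lambda B & \lambda I_n \end{pmatrix}$$

and that

$$\begin{pmatrix} I_m & 0 \\ -B & \lambda I_n \end{pmatrix} \begin{pmatrix} \lambda I_m & A \\ B & I_n \end{pmatrix} = \begin{pmatrix} \lambda I_m & A \\ 0 & \lambda I_n - BA \end{pmatrix}.$$

By taking determinants and equating the right-hand sides, we obtain

$$\lambda^n \det(\lambda I_m - AB) = \lambda^m \det(\lambda I_n - BA). \qquad (2.11)$$

Thus, $\det(\lambda I_m - AB) = 0$ if and only if $\det(\lambda I_n - BA) = 0$ when λ ≠ 0. It is immediate that AB and BA have the same nonzero eigenvalues, including multiplicity (by factorization).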
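Identity (2.11) says that the characteristic polynomials of AB and BA differ only by a power of λ. This can be checked on a rectangular example. The sketch below computes characteristic polynomials exactly with the Faddeev-LeVerrier recursion, which is a technique chosen here for illustration and is not used in the text; the matrices and function names are likewise assumptions.

```python
from fractions import Fraction

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def charpoly(m):
    """Coefficients [1, c_1, ..., c_n] of det(lambda*I - m), highest degree first.

    Faddeev-LeVerrier: M_k = m*M_{k-1} + c_{k-1}*I, c_k = -tr(m*M_k)/k,
    starting from M_0 = 0 and c_0 = 1.
    """
    n = len(m)
    mk = [[Fraction(0)] * n for _ in range(n)]
    c, coeffs = Fraction(1), [Fraction(1)]
    for k in range(1, n + 1):
        mk = matmul(m, mk)
        for i in range(n):
            mk[i][i] += c
        prod = matmul(m, mk)
        c = -sum(prod[i][i] for i in range(n)) / k
        coeffs.append(c)
    return coeffs

# Illustrative data: A is 2 x 3 (m = 2, n = 3), B is 3 x 2.
A = [[1, 2, 0], [0, 1, 1]]
B = [[1, 0], [2, 1], [0, 3]]
m_rows, n_cols = 2, 3

p_ab = charpoly(matmul(A, B))   # degree 2
p_ba = charpoly(matmul(B, A))   # degree 3

# Multiplying a polynomial by lambda^k appends k zero coefficients, so
# (2.11) becomes an equality of padded coefficient lists.
lhs = p_ab + [0] * n_cols       # lambda^n * det(lambda*I_m - AB)
rhs = p_ba + [0] * m_rows       # lambda^m * det(lambda*I_n - BA)
print(p_ab, p_ba, lhs == rhs)
```

Here det(λI - BA) equals λ times det(λI - AB), so BA has the two eigenvalues of AB together with one extra zero eigenvalue.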
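Proof 2. Use matrix similarity. Consider the block matrix

$$\begin{pmatrix} 0 & 0 \\ B & 0 \end{pmatrix}.$$

Add the second row multiplied by A from the left to the first row:

$$\begin{pmatrix} AB & 0 \\ B & 0 \end{pmatrix}.$$

Do the similar operation for columns to get

$$\begin{pmatrix} 0 & 0 \\ B & BA \end{pmatrix}.$$

Write, in equation form,

$$\begin{pmatrix} I_m & A \\ 0 & I_n \end{pmatrix} \begin{pmatrix} 0 & 0 \\ B & 0 \end{pmatrix} = \begin{pmatrix} AB & 0 \\ B & 0 \end{pmatrix}$$

and

$$\begin{pmatrix} 0 & 0 \\ B & 0 \end{pmatrix} \begin{pmatrix} I_m & A \\ 0 & I_n \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ B & BA \end{pmatrix}.$$

It follows that

$$\begin{pmatrix} I_m & A \\ 0 & I_n \end{pmatrix}^{-1} \begin{pmatrix} AB & 0 \\ B & 0 \end{pmatrix} \begin{pmatrix} I_m & A \\ 0 & I_n \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ B & BA \end{pmatrix};$$

that is, matrices

$$\begin{pmatrix} AB & 0 \\ B & 0 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 0 & 0 \\ B & BA \end{pmatrix}$$

are similar. Thus, matrices AB and BA have the same nonzero eigenvalues, counting multiplicity. (Are AB and BA similar?)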


Proof 3. Use the continuity argument. We first deal with the case where m = n. If A is nonsingular, then

$$BA = A^{-1}(AB)A.$$

Thus, AB and BA are similar and have the same eigenvalues.

If A is singular, let δ be such a positive number that εI + A is nonsingular for every ε, 0 < ε < δ. Then

$$(\epsilon I + A)B \quad \text{and} \quad B(\epsilon I + A)$$

are similar and have the same characteristic polynomials. Therefore,

$$\det(\lambda I_n - (\epsilon I_n + A)B) = \det(\lambda I_n - B(\epsilon I_n + A)), \quad 0 < \epsilon < \delta.$$

Since both sides are continuous functions of ε, letting ε → 0 gives

$$\det(\lambda I_n - AB) = \det(\lambda I_n - BA).$$

Thus, AB and BA have the same eigenvalues.
For the case where m ≠ n, assume m < n. Augment A and B by zero rows and zero columns, respectively, so that

$$A_1 = \begin{pmatrix} A \\ 0 \end{pmatrix}, \qquad B_1 = (B, 0)$$

are n-square matrices. Then

$$A_1B_1 = \begin{pmatrix} AB & 0 \\ 0 & 0 \end{pmatrix} \quad \text{and} \quad B_1A_1 = BA.$$

It follows that $A_1B_1$ and $B_1A_1$, consequently AB and BA, have the same nonzero eigenvalues, counting multiplicity.


Proof 4. Treat matrices as operators. It must be shown that if $\lambda I_m - AB$ is singular, so is $\lambda I_n - BA$, and vice versa. It may be assumed that λ = 1, by multiplying by 1/λ otherwise.

If $I_m - AB$ is invertible, let $X = (I_m - AB)^{-1}$. We compute

$$\begin{aligned} (I_n - BA)(I_n + BXA) &= I_n + BXA - BA - BABXA \\ &= I_n + (BXA - BABXA) - BA \\ &= I_n + B(I_m - AB)XA - BA \\ &= I_n + BA - BA \\ &= I_n. \end{aligned}$$

Thus, $I_n - BA$ is invertible. Note that this approach gives no information on the multiplicity of the nonzero eigenvalues. ∎
As a side product, using (2.11), we have (see also Problem 5 of Section 2.2), for any m x n matrix A and n x m matrix B,

$$\det(I_m + AB) = \det(I_n + BA).$$

Note that $I_m + AB$ is invertible if and only if $I_n + BA$ is invertible.
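The determinant identity is especially handy when one of m, n is 1, since one side then reduces to a scalar. The following illustrative example takes m = 2 and n = 1.

```latex
% Illustrative data: A is 2 x 1 and B is 1 x 2.
A = \begin{pmatrix} 1 \\ 2 \end{pmatrix}, \qquad
B = \begin{pmatrix} 3 & 4 \end{pmatrix}, \qquad
AB = \begin{pmatrix} 3 & 4 \\ 6 & 8 \end{pmatrix}, \qquad
BA = 11.
% Both sides of det(I_m + AB) = det(I_n + BA):
\det(I_2 + AB) = \begin{vmatrix} 4 & 4 \\ 6 & 9 \end{vmatrix} = 36 - 24 = 12,
\qquad
\det(I_1 + BA) = 1 + 11 = 12.
```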


Problems


1. Show that tr(AB) = tr(BA) for any m x n matrix A and n x m matrix B. In particular, tr(A*A) = tr(AA*).


2. For any square matrices A and B of the same size, show that

$$\operatorname{tr}(A + B)^2 = \operatorname{tr} A^2 + 2\operatorname{tr}(AB) + \operatorname{tr} B^2.$$

Does it follow that $(A + B)^2 = A^2 + 2AB + B^2$?


3. Let A and B be square matrices of the same size. Show that

det(AB) = det(BA).

Does this hold if A and B are not square? Is it true that

rank (AB) = rank (BA)?


4. Let A and B be m x n and n x m complex matrices, respectively, with m < n. If the eigenvalues of AB are $\lambda_1, \ldots, \lambda_m$, what are the eigenvalues of BA?


5. Let A and B be n x n matrices. Show that for every integer k ≥ 1,

$$\operatorname{tr}(AB)^k = \operatorname{tr}(BA)^k.$$

Does

$$\operatorname{tr}(AB)^k = \operatorname{tr}(A^kB^k)?$$


6. Show that for any $x, y \in \mathbb{C}^n$

$$\det(I_n + xy^*) = 1 + y^*x.$$


7. Compute the determinant

$$\begin{vmatrix} 1 + x_1y_1 & x_1y_2 & \cdots & x_1y_n \\ x_2y_1 & 1 + x_2y_2 & \cdots & x_2y_n \\ \vdots & \vdots & \ddots & \vdots \\ x_ny_1 & x_ny_2 & \cdots & 1 + x_ny_n \end{vmatrix}.$$
8. If A, B, and C are three complex matrices of appropriate sizes, show that ABC, CAB, and BCA have the same nonzero eigenvalues. Is it true that ABC and CBA have the same nonzero eigenvalues?


9. Do A* and A have the same nonzero eigenvalues? How about A*A and AA*? Show by example that det(AA*) ≠ det(A*A) in general.


10. Let A, B € M,. Show that (2 a) is similar to a a via é a 
11. For any square matrices A and B of the same size, show that 


ae e | 


2 2 
A* + B* and e B? 


have the same nonzero eigenvalues. Further show that the latter 
block matrix must have zero eigenvalues. How many of them? 
12. Let Ac M,. Find the eigenvalues of e 7) in terms of those of A. 


13. Let A be a 3 x 2 matrix and B be a 2 x 3 matrix such that

$$AB = \begin{pmatrix} 8 & 2 & -2 \\ 2 & 5 & 4 \\ -2 & 4 & 5 \end{pmatrix}.$$

Find the ranks of AB and $(AB)^2$ and show that $BA = \begin{pmatrix} 9 & 0 \\ 0 & 9 \end{pmatrix}$.


14. Let A be an m x n complex matrix and $M = \begin{pmatrix} 0 & A \\ A^* & 0 \end{pmatrix}$. Show that


(a) M is a Hermitian matrix.
(b) The eigenvalues of M are

$$-\sigma_1, \ldots, -\sigma_r, \underbrace{0, \ldots, 0}_{m+n-2r}, \sigma_r, \ldots, \sigma_1,$$

where $\sigma_1 \ge \cdots \ge \sigma_r$ are the positive square roots of the nonzero eigenvalues of A*A, called singular values of A.


(c) $\det M = \det(-A^*A) = (-1)^n |\det A|^2$ if A is n-square.
(d) 2 : - = e a + a a) for any matrices B, C. 


15. Let A and B be matrices of sizes m x n and n x m, respectively. Do 
AB and BA have the same nonzero singular values? 


2.5 The Continuity Argument and Matrix Functions


One of the most frequently used techniques in matrix theory is the continuity argument. A good example of this is to show, as we saw in the previous section, that matrices AB and BA have the same set of eigenvalues when A and B are both square matrices of the same size. It goes as follows. First consider the case where A is invertible and conclude that AB and BA are similar due to the fact that

$$AB = A(BA)A^{-1}.$$

If A is singular, consider A + εI. Choose δ > 0 such that A + εI is invertible for all ε, 0 < ε < δ. Thus, (A + εI)B and B(A + εI) have the same set of eigenvalues for every ε ∈ (0, δ).

Equate the characteristic polynomials to get

$$\det(\lambda I - (A + \epsilon I)B) = \det(\lambda I - B(A + \epsilon I)), \quad 0 < \epsilon < \delta.$$

Since both sides are continuous functions of ε, letting ε → 0⁺ gives

$$\det(\lambda I - AB) = \det(\lambda I - BA).$$

Thus, AB and BA have the same eigenvalues.
The proof was done in three steps:
1. Show that the assertion is true for the nonsingular A.
2. Replace singular A by nonsingular A + εI.
3. Use continuity of a function in ε to get the desired conclusion.


We have used and will more frequently use the following theorem. 


Theorem 2.9 Let A be an n x n matrix. If A is singular, then there exists a δ > 0 such that A + εI is nonsingular for all ε ∈ (0, δ).


Proof. The polynomial det(λI + A) in λ has at most n zeros. If they are all 0, we can take δ to be any positive number. Otherwise, let δ be the smallest modulus of the nonzero zeros λ. This δ serves the purpose. ∎
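For a concrete instance of the choice of δ in this proof, take the following singular matrix (an illustrative example, not from the text).

```latex
% Illustrative singular matrix:
A = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}, \qquad
\det(\lambda I + A) = (\lambda + 1)^2 - 1 = \lambda(\lambda + 2).
% The zeros are 0 and -2; the smallest nonzero modulus is 2, so take
\delta = 2: \qquad
\det(A + \epsilon I) = \epsilon(\epsilon + 2) \neq 0
\quad \text{for all } \epsilon \in (0, 2).
```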

A continuity argument is certainly an effective way for many matrix problems when a singular matrix is involved. The setting in which the technique is used is rather important. Sometimes the result for nonsingular matrices may be invalid for the singular case. Here is an example for which the continuity argument fails.


Theorem 2.10 Let C and D be n-square matrices such that

$$CD^T + DC^T = 0.$$

If D is nonsingular, then for any n-square matrices A and B

$$\begin{vmatrix} A & B \\ C & D \end{vmatrix} = \det(AD^T + BC^T).$$

The identity is invalid in general if D is singular.


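Proof. It is easy to verify that

$$\begin{pmatrix} A & B \\ C & D \end{pmatrix} \begin{pmatrix} D^T & 0 \\ C^T & I \end{pmatrix} = \begin{pmatrix} AD^T + BC^T & B \\ 0 & D \end{pmatrix}.$$

Taking determinants of both sides and canceling $\det D^T = \det D \neq 0$ results in the desired identity.


For an example of the singular case, we take A, B, C, and D to be, respectively,

$$\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \quad \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}, \quad \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \quad \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix},$$

where D is singular. It is easy to see by a simple computation that the determinant identity does not hold.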
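The "simple computation" for this example can be written out as follows; the hypothesis $CD^T + DC^T = 0$ holds, yet the two determinants differ in sign.

```latex
% The hypothesis holds because both products vanish:
CD^T = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}
       \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} = 0, \qquad
DC^T = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}
       \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} = 0.
% Left-hand side: the block matrix is a permutation matrix (a 3-cycle), so
\begin{vmatrix} A & B \\ C & D \end{vmatrix}
= \begin{vmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{vmatrix} = 1.
% Right-hand side:
AD^T + BC^T
= \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} + \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}
= \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad
\det(AD^T + BC^T) = -1.
```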


The continuity argument may be applied to more general func- 
tions of matrices. For instance, the trace and determinant depend 
continuously on the entries of a matrix. These are easy to see as the 
trace is the sum of the main diagonal entries and the determinant 
is the sum of all products of (different) diagonal entries. So we may 
simply say that the trace and determinant are continuous functions 
of (the entries of) the matrix. 

We have used the term matrix function. What is a matrix function after all? A matrix function, f(A), or function of a matrix can have several different meanings. It can be an operation on a matrix producing a scalar, such as tr A and det A; it can be a mapping from a matrix space to a matrix space, like $f(A) = A^2$; it can also be entrywise operations on the matrix, for instance, $g(A) = (a_{ij}^2)$. In this book we use the term matrix function in a general (loose) sense; that is, a matrix function is a mapping f : A ↦ f(A) as long as f(A) is well defined, where f(A) is a scalar or a matrix (or a vector).

Given a square matrix A, the square of A, $A^2$, is well defined. How about a square root of A? Take $A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$ for example. There is no matrix B such that $B^2 = A$. After a moment's consideration, one may realize that this thing is nontrivial. In fact, generalizing a function f(z) of a scalar variable z ∈ C to a matrix function f(A) is a serious business and it takes great effort.

Most of the terminology in calculus can be defined for square 
matrices. For instance, a matrix sequence (or series) is convergent if 


it is convergent entrywise. As an example, 


1 k=l 
E E 0 1 
[; fi > & a as k + oo. 


For differentiation and integration, let $A(t) = (a_{ij}(t))$ and denote

$$\frac{d}{dt}A(t) = \left(\frac{d}{dt}a_{ij}(t)\right), \qquad \int A(t)\,dt = \left(\int a_{ij}(t)\,dt\right).$$


That is, by differentiating or integrating a matrix we mean to perform 
the operation on the matrix entrywise. It can be shown that the 
product rule for derivatives in calculus holds for matrices whereas 
the power rule does not. Now one is off to a good start working on 
matrix calculus, which is useful for differential equations. Interested 
readers may pursue and explore more in this direction. 
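The contrast between the product rule and the power rule can be seen on a 2 x 2 example. The matrix A(t) below is an illustrative choice for which A(t) and its derivative do not commute.

```latex
% Product rule for matrix-valued functions (order of factors matters):
\frac{d}{dt}\bigl(A(t)B(t)\bigr) = A'(t)B(t) + A(t)B'(t).
% Illustrative example:
A(t) = \begin{pmatrix} 1 & t \\ 0 & 0 \end{pmatrix}, \qquad
A'(t) = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad
A(t)^2 = \begin{pmatrix} 1 & t \\ 0 & 0 \end{pmatrix} = A(t).
% Product rule applied to A(t)A(t):
\frac{d}{dt}A(t)^2 = A'A + AA'
= \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} + \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}
= \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} = A'(t).
% The scalar power rule would instead predict
2A(t)A'(t) = \begin{pmatrix} 0 & 2 \\ 0 & 0 \end{pmatrix} \neq \frac{d}{dt}A(t)^2.
```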


Problems 


1. Why did the continuity argument fail for Theorem 2.10?


2. Let C and D be real matrices such that $CD^T + DC^T = 0$. Show that if C is skew-symmetric (i.e., $C^T = -C$), then so is DC.


3. Show that A has no square root. How about B and C, where 


010 
4=(9 0): B=(0 00 .o=(% 4) 
000 
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4. Let A= (33). Find a 2 x 2 matrix X so that X? = A. 


5. Use a continuity argument to show that for any A,B € M, 
adj(AB) = adj(B) adj(A). 


6. Show that A, = P.J.P-' if «#0, where 


e 0 0 € 0 0 
A= (fo) F(t) (0 e): 


What happens to the matrix identity if ¢ + 0? Is Ap similar to Jo? 


7. Explain why rank(A?) < rank(A). Discuss whether a continuity 
argument can be used to show the inequality. 


8. Show that the eigenvalues of A are independent of €, where 
e-—1 -1 
A=( eo chy =) 


9. Denote by Omax and Omin, Omax > Amin, the singular values of matrix 


A= (1£), €>0. Show that lim, ,)- omax/Omin = +00. 


10. Let A be a nonsingular matrix with A~! = B = (b;;). Show that 
bi; are continuous functions of a;;, the entries of A, and that if 
lim;_,9 A(t) = A and det A 4 0 (this condition is necessary), then 


lim (A(t))~' = AT}. 


t0 


Conclude that 
lim(A—AI)~' = A“! 
430 


and for any m x n matrix X and n x m matrix Y, independent of e, 
-1 

inn ( Im = ) = Imn- 

11. Let A ∈ M_n. If |λ| < 1 for all eigenvalues λ of A, show that

$$(I - A)^{-1} = \sum_{k=0}^{\infty} A^k = I + A + A^2 + A^3 + \cdots.$$


12. Let A= ‘ie a Show that )77°, 4 A* is convergent. 
13. Let p(x), q(x) be polynomials and A ∈ M_n be such that q(A) is invertible. Show that $p(A)(q(A))^{-1} = (q(A))^{-1}p(A)$. Conclude that $(I - A)^{-1}(I + A^2) = (I + A^2)(I - A)^{-1}$ when A has no eigenvalue 1.


Let n be a positive number and x be a real number. Let 


eae 


Pie Lae jay f 8 1 
tin (im 5-4) =( 2, 9) 


[Hint: A = cP for some constant c and orthogonal matrix P.] 


SIR 


Show that 


15. For any square matrix X, show that $e^X = \sum_{k=0}^{\infty} \frac{1}{k!}X^k$ is well defined; that is, the series always converges. Let A, B ∈ M_n. Show that

(a) If A = 0, then $e^A = I$.

(b) If A = I, then $e^A = eI$.

(c) If AB = BA, then $e^{A+B} = e^Ae^B = e^Be^A$.

(d) If A is invertible, then $e^{-A} = (e^A)^{-1}$.

(e) If A is invertible, then $e^{ABA^{-1}} = Ae^BA^{-1}$.

(f) If λ is an eigenvalue of A, then $e^\lambda$ is an eigenvalue of $e^A$.

(g) $\det e^A = e^{\operatorname{tr} A}$.

(h) $(e^A)^* = e^{A^*}$.

(i) If A is Hermitian, then $e^{iA}$ is unitary.

(j) If A is real skew-symmetric, then $e^A$ is (real) orthogonal.


16. Let A ∈ M_n and t ∈ R. Show that $\frac{d}{dt}e^{tA} = Ae^{tA} = e^{tA}A$.


Qt 


17. Let A(t) = 6 ), where ¢ € R. Find fo A t)dt and 4 A(t). 


1+t nae 
2.6 Localization of Eigenvalues: The Geršgorin Theorem


Is there a way to locate in the complex plane the eigenvalues of a matrix? The Geršgorin theorem is a celebrated result on this; it ensures that the eigenvalues of a matrix lie in certain discs in the complex plane centered at the diagonal entries of the matrix.

Before proceeding, we note that the eigenvalues of a matrix are continuous functions of the matrix entries. To see this, as an example, we examine the 2 x 2 case for

$$A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}.$$

A computation gives the eigenvalues of A

$$\lambda = \frac{1}{2}\left[a_{11} + a_{22} \pm \sqrt{(a_{11} - a_{22})^2 + 4a_{12}a_{21}}\,\right],$$

which are obviously continuous functions of the entries of A.

In general, the eigenvalues of a matrix depend continuously on the 
entries of the matrix. This follows from the continuous dependence of 
the zeros of a polynomial on its coefficients, which invokes the theory 
of polynomials with real or complex coefficients. Simply put: small 
changes in the coefficients of a polynomial can lead only to small 
changes in any zero. As a result, the eigenvalues of a (real or complex) 
matrix depend continuously upon the entries of the matrix. The idea 
of the proof of this goes as follows. Obviously, the coefficients of the 
characteristic polynomial depend continuously on the entries of the 
matrix. It remains to show that the roots of a polynomial depend 
continuously on the coefficients. Consider, without loss of generality, 
the zero root case. 

Let $p(x) = x^n + a_1x^{n-1} + \cdots + a_n$ with $p(0) = 0$. Then $a_n = 0$. For
any positive number $\epsilon$, if $q(x) = x^n + b_1x^{n-1} + \cdots + b_n$ is a polynomial
such that $|b_i - a_i| < \epsilon$, $i = 1, \ldots, n$, then the roots $x_1, \ldots, x_n$ of $q(x)$
satisfy $|x_1 \cdots x_n| = |b_n| < \epsilon$. It follows that $|x_i| < \sqrt[n]{\epsilon}$ for some $i$.
This means that $q(x)$ has a root "close" to 0.
Theorem 2.11 The eigenvalues of a matrix are continuous functions
of the entries of the matrix.


Because singular values are the square roots of the eigenvalues of 
certain matrices, singular values are also continuous functions of the 
entries of the matrix. It is readily seen that determinant and trace 
are continuous functions of the entries of the matrix too. 


Theorem 2.12 (Geršgorin) Let $A = (a_{ij}) \in M_n$ and let

\[ r_i = \sum_{j=1,\, j \ne i}^{n} |a_{ij}|, \quad i = 1, 2, \ldots, n. \]

Then all the eigenvalues of $A$ lie in the union of $n$ closed discs

\[ \bigcup_{i=1}^{n} \{ z \in \mathbb{C} : |z - a_{ii}| \le r_i \}. \]

Furthermore, if a union of $k$ of these $n$ discs forms a connected region
that is disjoint from the remaining $n - k$ discs, then there are exactly
$k$ eigenvalues of $A$ in this region (counting algebraic multiplicities).


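Let us see an example before proving the theorem. For

\[ A = \begin{pmatrix} 1 & 1 & 0 \\ 0.25 & 2 & 0.25 \\ 0.25 & 0 & 3 \end{pmatrix}, \]

there are three Geršgorin discs:

\[ G_1 = \{ z \in \mathbb{C} : |z - 1| \le 1 \}, \]
\[ G_2 = \{ z \in \mathbb{C} : |z - 2| \le 0.5 \}, \]
\[ G_3 = \{ z \in \mathbb{C} : |z - 3| \le 0.25 \}. \]

The union of the first two discs is connected and disjoint from $G_3$. Thus
there are two eigenvalues of $A$ in $G_1 \cup G_2$ and one eigenvalue in $G_3$.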
Figure 2.7: Geršgorin discs
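The example can also be checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper names `gershgorin_discs` and `in_some_disc` are illustrative and not part of the text. It computes the row discs of the matrix above and counts the computed eigenvalues that fall in $G_3$.

```python
import numpy as np

def gershgorin_discs(A):
    """Return (center, radius) pairs for the row Gersgorin discs of A."""
    A = np.asarray(A, dtype=complex)
    centers = np.diag(A)
    # r_i = sum of |a_ij| over j != i
    radii = np.sum(np.abs(A), axis=1) - np.abs(centers)
    return list(zip(centers, radii))

def in_some_disc(z, discs, tol=1e-12):
    """True if z lies in at least one of the closed discs (up to rounding)."""
    return any(abs(z - c) <= r + tol for c, r in discs)

# The 3 x 3 matrix from the example above.
A = np.array([[1.0, 1.0, 0.0],
              [0.25, 2.0, 0.25],
              [0.25, 0.0, 3.0]])

discs = gershgorin_discs(A)
eigenvalues = np.linalg.eigvals(A)

all_covered = all(in_some_disc(z, discs) for z in eigenvalues)
count_in_G3 = sum(abs(z - 3.0) <= 0.25 + 1e-12 for z in eigenvalues)
count_in_G1_or_G2 = sum(
    abs(z - 1.0) <= 1.0 + 1e-12 or abs(z - 2.0) <= 0.5 + 1e-12
    for z in eigenvalues
)
```

Here `all_covered` reflects the first part of the theorem, while the two counts reflect the second part for this particular matrix.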


Proof of Theorem 2.12. Let $\lambda$ be an eigenvalue of $A$, and let $x$ be an
eigenvector of $A$ corresponding to $\lambda$. Suppose that $x_p$ is the largest
component of $x$ in absolute value; that is,

\[ |x_p| \ge |x_i|, \quad i = 1, 2, \ldots, n. \]

Then $x_p \ne 0$. The equation $Ax = \lambda x$ gives

\[ \sum_{j=1}^{n} a_{pj}x_j = \lambda x_p \]

or

\[ x_p(\lambda - a_{pp}) = \sum_{j=1,\, j \ne p}^{n} a_{pj}x_j. \]

By taking absolute value, we have

\begin{align*}
|x_p|\,|\lambda - a_{pp}| &= \Big| \sum_{j=1,\, j \ne p}^{n} a_{pj}x_j \Big| \\
&\le \sum_{j=1,\, j \ne p}^{n} |a_{pj}x_j| \\
&= \sum_{j=1,\, j \ne p}^{n} |a_{pj}|\,|x_j| \\
&\le |x_p| \sum_{j=1,\, j \ne p}^{n} |a_{pj}| \\
&= |x_p|\, r_p.
\end{align*}

It follows that $|\lambda - a_{pp}| \le r_p$, and that $\lambda$ lies in a closed disc centered
at $a_{pp}$, with radius $r_p$.

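To prove the second part, we assume that the first $k$ discs, centered
at $a_{11}, \ldots, a_{kk}$, form a connected union $G$ which is disjoint from
other discs. Let $A = D + B$, where $D = \operatorname{diag}(a_{11}, a_{22}, \ldots, a_{nn})$, and

\[ A_\epsilon = D + \epsilon B, \quad \epsilon \in [0, 1], \]

and denote by $r_i^{\epsilon}$ the radius $r_i$ for $A_\epsilon$. Then $r_i^{\epsilon} = \epsilon r_i$. It is immediate that
for every $\epsilon \in [0, 1]$ the set $G$ contains the union $G_\epsilon$ of the first $k$ discs
of $A_\epsilon$, where

\[ G_\epsilon = \bigcup_{i=1}^{k} \{ z \in \mathbb{C} : |z - a_{ii}| \le r_i^{\epsilon} \}. \]

Consider the eigenvalues of $A_0$ and $A_\epsilon$:

\[ \lambda_i(A_0) = a_{ii}, \quad \lambda_i(A_\epsilon), \quad i = 1, 2, \ldots, k, \quad \epsilon > 0. \]

Because the eigenvalues are continuous functions of the entries of $A$
and because for each $i = 1, 2, \ldots, k$,

\[ \lambda_i(A_0) \in G_\epsilon \subseteq G, \quad \text{for all } \epsilon \in [0, 1], \]

we have that each $\lambda_i(A_0)$ is joined to some $\lambda_i(A_1) = \lambda_i(A)$ by the
continuous curve

\[ \{ \lambda_i(A_\epsilon) : 0 \le \epsilon \le 1 \} \subseteq G. \]

Thus for each $\epsilon \in [0, 1]$, there are at least $k$ eigenvalues of $A_\epsilon$ in $G_\epsilon$,
and $G$ contains at least $k$ eigenvalues of $A$ (not necessarily different).
The remaining $n - k$ eigenvalues of $A_0$ start outside the connected
set $G$, and those eigenvalues of $A$ lie outside $G$.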


An application of this theorem to $A^T$ gives a version of the theorem
for the columns of $A$.
Problems


1. Apply the Geršgorin theorem to the matrix


1 1 0 
-1 -1 0 
1 2 4A 


2. Show by the Geršgorin theorem that A has three different eigenvalues:


os 1 i 0 
A=| 23 6]. -@ 244 
 & & ee 


Conclude that A is diagonalizable, i.e., $S^{-1}AS$ is diagonal for some S.


3. Construct a 4 × 4 complex matrix so that it contains no zero entries
and that the four different eigenvalues of the matrix lie in the discs
centered at $1$, $-1$, $i$, and $-i$, all with diameter 1.


4. Illustrate the Geršgorin theorem by the matrix


1 1 1 
a a ac 
1 : 1 
al (Oe ee ee 
2 4 2: 
1 1 1 
q 72 3 2-21 


5. State and prove the first part of the Geršgorin theorem for columns.


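6. Let $A \in M_n$ and $D = \operatorname{diag}(A)$; that is, the diagonal matrix of $A$.
Denote $B = A - D$. If $\lambda$ is an eigenvalue of $A$ and it is not a diagonal
entry of $A$, show that 1 is an eigenvalue of $(\lambda I - D)^{-1}B$.


7. Let $A = (a_{ij}) \in M_n$. Show that for any eigenvalue $\lambda$ of $A$

\[ |\lambda| \ge \min_i \{ |a_{ii}| - r_i \}, \quad \text{where } r_i = \sum_{j=1,\, j \ne i}^{n} |a_{ij}|. \]

Derive that

\[ |\det A| \ge \big( \min_i \{ |a_{ii}| - r_i \} \big)^n. \]


8. Let $A = (a_{ij}) \in M_n$. Show that for any eigenvalue $\lambda$ of $A$

\[ |\lambda| \le \min \Big\{ \max_i \sum_{s=1}^{n} |a_{is}|, \; \max_j \sum_{t=1}^{n} |a_{tj}| \Big\}. \]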
9. Let $A = (a_{ij}) \in M_n$. Show that for any eigenvalue $\lambda$ of $A$

\[ |\lambda| \le n \max_{i,j} |a_{ij}|. \]


10. Let $A = (a_{ij}) \in M_n$. Show that for any positive numbers $d_1, \ldots, d_n$
and for any eigenvalue $\lambda$ of $A$

\[ |\lambda| \le \min \Big\{ \max_i \frac{1}{d_i} \sum_{j=1}^{n} d_j |a_{ij}|, \; \max_j \frac{1}{d_j} \sum_{i=1}^{n} d_i |a_{ij}| \Big\}. \]


11. (Levy-Desplanques) A matrix $A \in M_n$ is said to be strictly
diagonally dominant if

\[ |a_{jj}| > \sum_{i=1,\, i \ne j}^{n} |a_{ij}|, \quad j = 1, 2, \ldots, n. \]

Show that a strictly diagonally dominant matrix is nonsingular.


12. Let $A = (a_{ij}) \in M_n$ and $R_i = \sum_{j=1}^{n} |a_{ij}|$, $i = 1, 2, \ldots, n$. Suppose
that $A$ does not have a zero row. Show that

\[ \operatorname{rank}(A) \ge \sum_{i=1}^{n} \frac{|a_{ii}|}{R_i}. \]


13. Let $A = (a_{ij}) \in M_n$. Denote $\delta = \min_{i \ne j} |a_{ii} - a_{jj}|$ and $\epsilon = \max_{i \ne j} |a_{ij}|$.
If $\delta > 0$ and $\epsilon \le \delta/4n$, show that each Geršgorin disc
contains exactly one eigenvalue of $A$.


14. Let $f(x) = x^n + a_1x^{n-1} + \cdots + a_n$ be any monic polynomial. Show
that all the roots of $f$ are bounded (in absolute value) by

\[ \gamma = 2 \max_{1 \le k \le n} |a_k|^{1/k}. \]

[Hint: Consider $|f(x)/x^n|$ and show that $|f(x)/x^n| > 0$ if $|x| > \gamma$.]


CHAPTER 3 


Matrix Polynomials and Canonical Forms 


Introduction: This chapter is devoted to matrix decompositions. 
The main studies are on the Schur decomposition, spectral decom- 
position, singular value decomposition, Jordan decomposition, and 
numerical range. Attention is also paid to the polynomials that anni- 
hilate matrices, especially the minimal and characteristic polynomi- 
als, and to the similarity of a complex matrix to a real matrix. At the 
end we introduce three important matrix operations: the Hadamard 
product, the Kronecker product, and compound matrices. 


3.1 Commuting Matrices 


Matrices do not commute in general. One may easily find two square
matrices $A$ and $B$ of the same size such that $AB \ne BA$. Any square
matrix $A$, however, commutes with polynomials in $A$.

A question arises: If a matrix $B$ commutes with $A$, is it true that
$B$ can be expressed as a polynomial in $A$? The answer is negative,
as one sees by taking $A$ to be the $n \times n$ identity matrix $I$ and $B$ to be
an $n \times n$ nondiagonal matrix. For some sorts of matrices, nevertheless,
we have the following result.


Theorem 3.1 Let $A$ and $B$ be $n \times n$ matrices such that $AB = BA$. If
all the eigenvalues of $A$ are distinct, then $B$ can be expressed uniquely
as a polynomial in $A$ with degree no more than $n - 1$.
Proof. To begin, recall from Theorem 1.6 that the eigenvectors
belonging to different eigenvalues are linearly independent. Thus, a
matrix with distinct eigenvalues has a set of linearly independent
eigenvectors that form a basis of $\mathbb{C}^n$ (or $\mathbb{R}^n$ if the matrix is real).

Let $u_1, u_2, \ldots, u_n$ be the eigenvectors corresponding to the
eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$ of $A$, respectively. Set

\[ P = (u_1, u_2, \ldots, u_n). \]

Then $P$ is an invertible matrix and

\[ P^{-1}AP = \operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n). \]

Let

\[ P^{-1}AP = C \quad \text{and} \quad P^{-1}BP = D. \]

It follows from $AB = BA$ that $CD = DC$.

The diagonal entries of $C$ are distinct, therefore $D$ must be a
diagonal matrix too (Problem 1). Let $D = \operatorname{diag}(\mu_1, \mu_2, \ldots, \mu_n)$.

Consider the linear equation system of unknowns $x_0, x_1, \ldots, x_{n-1}$:

\begin{align*}
x_0 + \lambda_1 x_1 + \cdots + \lambda_1^{n-1} x_{n-1} &= \mu_1, \\
x_0 + \lambda_2 x_1 + \cdots + \lambda_2^{n-1} x_{n-1} &= \mu_2, \\
&\;\;\vdots \\
x_0 + \lambda_n x_1 + \cdots + \lambda_n^{n-1} x_{n-1} &= \mu_n.
\end{align*}

Because the coefficient matrix is a Vandermonde matrix that is
nonsingular when $\lambda_1, \lambda_2, \ldots, \lambda_n$ are distinct (see Problem 3 of
Section 1.2 or Theorem 5.9 in Chapter 5), the equation system has a
unique solution, say, $(a_0, a_1, \ldots, a_{n-1})$.

Define a polynomial with degree no more than $n - 1$ by

\[ p(x) = a_0 + a_1x + a_2x^2 + \cdots + a_{n-1}x^{n-1}. \]

It follows that

\[ p(\lambda_i) = \mu_i, \quad i = 1, 2, \ldots, n. \]

It is immediate that $p(A) = B$ since $p(C) = D$, and this polynomial
$p(x)$ is unique because the solution to the system is unique. ∎
Such a method of finding a polynomial $p(x)$ of the given pairs is
referred to as interpolation. The proof also shows that there exists an
invertible matrix $P$ such that $P^{-1}AP$ and $P^{-1}BP$ are both diagonal.
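The interpolation step in the proof of Theorem 3.1 can be carried out numerically. The sketch below assumes NumPy; the matrices $P$, the eigenvalues `lam`, and the values `mu` are hypothetical example data chosen only so that $A$ has distinct eigenvalues and $B$ commutes with $A$. It solves the Vandermonde system for the coefficients and evaluates $p(A)$.

```python
import numpy as np

# Hypothetical example data: columns of P are common eigenvectors.
P = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0]])
lam = np.array([1.0, 2.0, 3.0])    # distinct eigenvalues of A
mu = np.array([5.0, -1.0, 2.0])    # eigenvalues of B

P_inv = np.linalg.inv(P)
A = P @ np.diag(lam) @ P_inv
B = P @ np.diag(mu) @ P_inv        # B commutes with A by construction

# Vandermonde system: sum_k a_k * lam_i**k = mu_i for each i.
V = np.vander(lam, increasing=True)
coeffs = np.linalg.solve(V, mu)

# p(A) = a_0 I + a_1 A + ... + a_{n-1} A^{n-1}
p_of_A = sum(a * np.linalg.matrix_power(A, k) for k, a in enumerate(coeffs))
```

Up to rounding, `p_of_A` agrees with `B`, and the polynomial has degree at most $n - 1 = 2$.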


Theorem 3.2 Let A and B be square matrices of the same size. If 
AB = BA, then there exists a unitary matrix U such that U* AU and 
U*BU are both upper-triangular. 


Proof. We use induction on n. If n = 1, we have nothing to show. 
Suppose that the assertion is true for n — 1. 

For the case of n, we consider matrices as linear transformations. 
Note that if A is a linear transformation on a finite-dimensional vec- 
tor space V over C, then A has at least one eigenvector in V, for 


\[ Ax = \lambda x, \; x \ne 0, \quad \text{if and only if} \quad \det(\lambda I - A) = 0, \]

which has a solution in $\mathbb{C}$.
For each eigenvalue $\mu$ of $B$, consider the eigenspace of $B$

\[ V_\mu = \{ v \in \mathbb{C}^n : Bv = \mu v \}. \]

If $A$ and $B$ commute, then for every $v \in V_\mu$,

\[ B(Av) = (BA)v = (AB)v = A(Bv) = A(\mu v) = \mu(Av). \]

Thus, $Av \in V_\mu$; that is, $V_\mu$ is an invariant subspace of $A$. As a linear
transformation on $V_\mu$, $A$ has an eigenvalue $\lambda$ and a corresponding
eigenvector $v_1$ in $V_\mu$. Put in symbols,

\[ Av_1 = \lambda v_1, \quad Bv_1 = \mu v_1, \quad v_1 \in V_\mu. \]

We may assume that $v_1$ is a unit vector. Extend $v_1$ to a unitary
matrix $U_1$; that is, $U_1$ is a unitary matrix whose first column is $v_1$.
By computation, we have

\[ U_1^*AU_1 = \begin{pmatrix} \lambda & \alpha \\ 0 & C \end{pmatrix} \quad \text{and} \quad U_1^*BU_1 = \begin{pmatrix} \mu & \beta \\ 0 & D \end{pmatrix}, \]

where $C, D \in M_{n-1}$, and $\alpha$ and $\beta$ are some row vectors.
It follows from $AB = BA$ that $CD = DC$. The induction hypothesis
guarantees a unitary matrix $U_2 \in M_{n-1}$ such that $U_2^*CU_2$
and $U_2^*DU_2$ are both upper-triangular. Let

\[ U = U_1 \begin{pmatrix} 1 & 0 \\ 0 & U_2 \end{pmatrix}. \]

Then $U$, a product of two unitary matrices, is unitary, and $U^*AU$
and $U^*BU$ are both upper-triangular.


Problems 


1. Let A be a diagonal matrix with different diagonal entries. If B is a 
matrix such that AB = BA, show that B is also diagonal. 


2. Let A,B € M,, and let A have n distinct eigenvalues. Show that 
AB = BA if and only if there exists a set of n linearly independent 
vectors as the eigenvectors of A and B. 


3. Let A= (% 5) and B= (1, }). Show that AB = BA. Find 


a unitary matrix U such that both U*AU and U* BU are upper- 
triangular. Show that such a U cannot be a real matrix. 


4. Give an example of matrices $A$ and $B$ for which $AB = BA$, $\lambda$ is an
eigenvalue of $A$, $\mu$ is an eigenvalue of $B$, but $\lambda + \mu$ is not an eigenvalue
of $A + B$, and $\lambda\mu$ is not an eigenvalue of $AB$.


5. Let $f(x)$ be a polynomial and let $A$ be an n-square matrix. Show
that for any n-square invertible matrix $P$,
\[ f(P^{-1}AP) = P^{-1}f(A)P \]
and that there exists a unitary matrix $U$ such that both $U^*AU$ and
$U^*f(A)U$ are upper-triangular.


6. Show that the adjoints, inverses, sums, products, and polynomials of 
upper-triangular matrices are upper-triangular. 


7. Show that every square matrix is a sum of two commuting matrices. 

8. If $A$ and $B$ are two matrices such that $AB = I_m$ and $BA = I_n$, show
that $m = n$, $AB = BA = I$, and $B = A^{-1}$.

9. Let $A$, $B$, and $C$ be matrices such that $AB = CA$. Show that for
any polynomial $f(x)$

\[ Af(B) = f(C)A. \]
10. Is it true that any linear transformation on a vector space over $\mathbb{R}$ has
at least a real eigenvalue?


11. Show that if AB = BA = 0 then rank (A + B) = rank (A) + rank (B).


12. Let $A_1, A_2, \ldots, A_k \in M_n$ be commuting matrices, i.e., $A_iA_j = A_jA_i$
for all $i, j$. Show that there exists a unitary matrix $U \in M_n$ such
that all $U^*A_iU$ are upper-triangular.


13. Let $A$ and $B$ be commuting matrices. If $A$ has $k$ distinct eigenvalues,
show that $B$ has at least $k$ linearly independent eigenvectors. Does
it follow that $B$ has $k$ distinct eigenvalues?


14. Let $A$ and $B$ be n-square matrices. If $AB = BA$, what are the
eigenvalues of $A + B$ and $AB$ in terms of those of $A$ and $B$?


15. Let $A, B \in M_n$. If $AB = BA$, find the eigenvalues of the matrix

\[ \begin{pmatrix} A & B \\ B & -A \end{pmatrix}. \]


16. What matrices in $M_n$ commute with all diagonal matrices? With all
Hermitian matrices? With all matrices in $M_n$?


17. Let $A$ and $B$ be square complex matrices. If $A$ commutes with $B$
and $B^*$, show that $A + A^*$ commutes with $B + B^*$.


18. Show that Theorem 3.2 holds for more than two commuting matrices.


19. What conclusion can be drawn from Theorem 3.2 if $B$ is assumed to
be the identity matrix?


20. Let $A$ and $B$ be complex matrices. Show that

\[ AB = A + B \;\Rightarrow\; AB = BA. \]


21. Let $A$ and $B$ be n- and m-square matrices, respectively, with $m \le n$.
If $AP = PB$ for an $n \times m$ matrix $P$ with columns linearly independent,
show that every eigenvalue of $B$ is an eigenvalue of $A$.


22. If $A$ and $B$ are nonsingular matrices such that $AB - BA$ is singular,
show that 1 is an eigenvalue of $A^{-1}B^{-1}AB$.


23. Let $A$ and $B$ be $n \times n$ matrices such that $\operatorname{rank}(AB - BA) \le 1$.
Show that $A$ and $B$ have a common eigenvector. Find a common
eigenvector (probably belonging to different eigenvalues) for


a=(7 2). 8=(4 6): 
24. Let $S = \{AB - BA : A, B \in M_n\}$. Show that Span $S$ is a subspace
of $M_n$ and that

\[ \dim(\operatorname{Span} S) = n^2 - 1. \]


25. Let $A$ and $B$ be $2n \times 2n$ matrices partitioned conformally as

\[ A = \begin{pmatrix} A_{11} & 0 \\ 0 & A_{22} \end{pmatrix}, \quad B = \begin{pmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{pmatrix}. \]

If $A$ commutes with $B$, show that $A_{11}$ and $A_{22}$ commute with $B_{11}$
and $B_{22}$, respectively, and that for any polynomial $f(x)$

\[ f(A_{11})B_{12} = B_{12}f(A_{22}), \quad f(A_{22})B_{21} = B_{21}f(A_{11}). \]

In particular, if $A_{11} = a_1I$ and $A_{22} = a_2I$ with $a_1 \ne a_2$, then $B_{12} =
B_{21} = 0$, and thus $B = B_{11} \oplus B_{22}$. The same conclusion follows if
$f(A_{11}) = 0$ (or singular) and $f(A_{22})$ is nonsingular (respectively, 0).


26. Let $S$ be the $n \times n$ backward identity matrix; that is,

\[ S = \begin{pmatrix} 0 & 0 & \cdots & 0 & 1 \\ 0 & 0 & \cdots & 1 & 0 \\ \vdots & \vdots & & \vdots & \vdots \\ 0 & 1 & \cdots & 0 & 0 \\ 1 & 0 & \cdots & 0 & 0 \end{pmatrix}. \]

Show that $S^{-1} = S^T = S$; equivalently, $S^T = S$, $S^2 = I$, $S^TS = I$.
What is $\det S$? When $n = 3$, compute $SAS$ for $A = (a_{ij}) \in M_3$.


27. Let $A = (a_{ij})$ be an $n \times n$ matrix and $S$ be the $n \times n$ backward
identity. Denote $\hat{A} = (\hat{a}_{ij}) = SAS$. Show that $\hat{a}_{ij} = a_{n-i+1,\,n-j+1}$.
[Note: The matrix $SAS$ can be obtained by any of the following
methods: (1) Relist all the rows in the reverse order then relist all the
columns of the resulting matrix; (2) Flip the matrix along the main
diagonal then flip the resulting matrix along the backward diagonal
$a_{1n}, \ldots, a_{n1}$; or (3) Rotate the matrix 180° in the plane; that is, hold
the upper-left corner, then rotate the paper by 180°.]


If A is one of the following matrices, show that the other one is SAS. 


? 


oxr= 


© 
x
oo
3.2 Matrix Decompositions


Factorizations of matrices into some special sorts of matrices via sim- 
ilarity are of fundamental importance in matrix theory. We study the 
following decompositions of matrices in this section: the Schur de- 
composition, spectral decomposition, singular value decomposition, 
and polar decomposition. We also continue our study of Jordan de- 
composition in later sections. 


Theorem 3.3 (Schur Decomposition) Let $\lambda_1, \lambda_2, \ldots, \lambda_n$ be the
eigenvalues of $A \in M_n$. Then there exists a unitary matrix $U \in M_n$
such that $U^*AU$ is an upper-triangular matrix. In symbols,

\[ U^*AU = \begin{pmatrix} \lambda_1 & & * \\ & \ddots & \\ 0 & & \lambda_n \end{pmatrix}. \]

Proof. This theorem follows from Theorem 3.2. We present a pure
matrix proof below without using the theory of vector spaces.
If $n = 1$, there is nothing to show. Suppose the statement is true
for matrices with sizes less than $n$. We show by induction that it is
true for the matrices of size $n$.
Let $x_1$ be a unit eigenvector of $A$ belonging to eigenvalue $\lambda_1$:

\[ Ax_1 = \lambda_1x_1, \quad x_1 \ne 0. \]

Extend $x_1$ to a unitary matrix $S = (x_1, y_2, \ldots, y_n)$. Then

\begin{align*}
AS &= (Ax_1, Ay_2, \ldots, Ay_n) \\
&= (\lambda_1x_1, Ay_2, \ldots, Ay_n) \\
&= S(u, S^{-1}Ay_2, \ldots, S^{-1}Ay_n),
\end{align*}

where $u = (\lambda_1, 0, \ldots, 0)^T$. Thus, we can write

\[ S^*AS = \begin{pmatrix} \lambda_1 & v \\ 0 & B \end{pmatrix}, \]
where $v$ is a row vector and $B \in M_{n-1}$.
Applying the induction hypothesis on $B$, we have a unitary matrix
$T$ of size $n - 1$ such that $T^*BT$ is upper-triangular. Let

\[ U = S \begin{pmatrix} 1 & 0 \\ 0 & T \end{pmatrix}. \]

Then $U$, a product of two unitary matrices, is unitary, and $U^*AU$
is upper-triangular. It is obvious that the diagonal entries $\lambda_i$ of the
upper-triangular matrix are the eigenvalues of $A$. ∎
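The inductive proof translates almost line by line into a recursive procedure. The following is a minimal sketch, assuming NumPy; the function name `schur_by_induction` is illustrative, and the extension of $x_1$ to a unitary matrix via a QR factorization is one convenient choice, not something prescribed by the text. It is meant to mirror the argument, not to replace a library routine.

```python
import numpy as np

def schur_by_induction(A):
    """Return a unitary U such that U^* A U is upper-triangular."""
    A = np.asarray(A, dtype=complex)
    n = A.shape[0]
    if n == 1:
        return np.eye(1, dtype=complex)
    # A unit eigenvector x1 of A.
    _, vecs = np.linalg.eig(A)
    x1 = vecs[:, 0] / np.linalg.norm(vecs[:, 0])
    # Extend x1 to a unitary S: the Q factor of [x1 | I] is n x n with
    # orthonormal columns, the first being x1 up to a unit scalar.
    S, _ = np.linalg.qr(np.column_stack([x1, np.eye(n)]))
    # S^* A S = [[lambda_1, v], [0, B]]; recurse on the block B.
    B = (S.conj().T @ A @ S)[1:, 1:]
    T = schur_by_induction(B)
    block = np.eye(n, dtype=complex)
    block[1:, 1:] = T
    return S @ block

# Hypothetical example matrix.
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 1.0, -1.0],
              [1.0, 0.0, 3.0]])
U = schur_by_induction(A)
T = U.conj().T @ A @ U
```

The strictly lower-triangular part of `T` is zero up to rounding, and its diagonal carries the eigenvalues of `A`.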


A weaker statement is that of triangularization. For every $A \in M_n$,
there exists an invertible $P$ such that $P^{-1}AP$ is upper-triangular.

Schur triangularization is one of the most important theorems in
linear algebra and matrix theory. It is used repeatedly in this book.
As an application, we see by taking the conjugate transpose that
any Hermitian matrix $A$ (i.e., $A^* = A$) is unitarily diagonalizable.
The same is true for normal matrices $A$, because the matrix identity
$A^*A = AA^*$, together with the Schur decomposition of $A$, implies
the desired decomposition form of $A$ (Problem 4).

A positive semidefinite matrix $A \in M_n$, that is, by definition, a matrix
with $x^*Ax \ge 0$ for all $x \in \mathbb{C}^n$ (instead of $\mathbb{R}^n$), has a similar structure.
To see this, it suffices to show that a positive semidefinite matrix is
necessarily Hermitian. This goes as follows.

Since $x^*Ax \ge 0$ for every $x \in \mathbb{C}^n$, we have, by taking $x$ to be the
column vector with the $s$th component 1, the $t$th component $c \in \mathbb{C}$,
and 0 elsewhere,

\[ x^*Ax = a_{ss} + a_{tt}|c|^2 + a_{ts}\bar{c} + a_{st}c \ge 0. \]

It follows that each diagonal entry $a_{ss}$ is nonnegative by putting
$c = 0$ and that $a_{st} = \overline{a_{ts}}$, or $A^* = A$, by putting $c = 1, i$, respectively.
Note that if $A$ is real and $x^TAx \ge 0$ for all real vectors $x$, $A$ need
not be symmetric. Let $A = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$. Then $x^TAx = 0$ for all $x \in \mathbb{R}^2$.
It is immediate that the eigenvalues $\lambda$ of a positive semidefinite
$A$ are nonnegative since $x^*Ax = \lambda x^*x$ for any eigenvector $x$ belonging to $\lambda$.
We summarize these discussions in the following theorem.
Theorem 3.4 (Spectral Decomposition) Let $A$ be an n-square
complex matrix with eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$. Then $A$ is normal
if and only if $A$ is unitarily diagonalizable; that is, there exists a
unitary matrix $U$ such that

\[ U^*AU = \operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n). \]

In particular, $A$ is Hermitian if and only if the $\lambda_i$ are all real and is
positive semidefinite if and only if the $\lambda_i$ are all nonnegative.


As a result, by taking the square roots of the $\lambda_i$ in the decomposition,
we see that for any positive semidefinite matrix $A$, there exists
a positive semidefinite matrix $B$ such that $A = B^2$. We show such
a matrix $B$ is unique (see also Section 7.1) and call it the square root
of $A$, denoted by $A^{1/2}$. In addition, we write $A \ge 0$ if $A$ is positive
semidefinite and $A > 0$ if $A$ is positive definite; that is, $x^*Ax > 0$
for all nonzero $x \in \mathbb{C}^n$. For two Hermitian matrices $A$ and $B$ of the
same size, we write $A \ge B$ if $A - B \ge 0$ and $A > B$ if $A - B > 0$.


Theorem 3.5 (Uniqueness of Square Root) Let A be an n-square 
positive semidefinite matrix. Then there is a unique n-square positive 
semidefinite matrix B such that B? = A. 


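Proof. Let $A = U^* \operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)U$ be a spectral decomposition
of $A$. Take $B = U^* \operatorname{diag}(\sqrt{\lambda_1}, \sqrt{\lambda_2}, \ldots, \sqrt{\lambda_n})U$. Then $B \ge 0$
and $B^2 = A$. For uniqueness, let $C$ also be a positive semidefinite
matrix such that $C^2 = A$ and, by the spectral decomposition, write
$C = V^* \operatorname{diag}(\mu_1, \mu_2, \ldots, \mu_n)V$ (actually $\mu_i = \sqrt{\lambda_i}$). Then $B^2 = C^2 = A$
implies that $U^* \operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)U = V^* \operatorname{diag}(\mu_1^2, \mu_2^2, \ldots, \mu_n^2)V$;
that is, $W \operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n) = \operatorname{diag}(\mu_1^2, \mu_2^2, \ldots, \mu_n^2)W$, where $W =
(w_{ij}) = VU^*$. This results in $w_{ij}\lambda_j = \mu_i^2w_{ij}$ for all $i, j$. It follows
that $w_{ij}\sqrt{\lambda_j} = \mu_iw_{ij}$. Therefore, $W \operatorname{diag}(\sqrt{\lambda_1}, \sqrt{\lambda_2}, \ldots, \sqrt{\lambda_n}) =
\operatorname{diag}(\mu_1, \mu_2, \ldots, \mu_n)W$, which reveals $B = C$. ∎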


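Such a $B$ in the theorem is called the square root of $A$ and is denoted
by $A^{1/2}$. The proof shows that if $A = U^* \operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)U$,
where all $\lambda_i \ge 0$, then $A^{1/2} = U^* \operatorname{diag}(\sqrt{\lambda_1}, \sqrt{\lambda_2}, \ldots, \sqrt{\lambda_n})U$. Likewise,
$A^{1/3} = U^* \operatorname{diag}(\sqrt[3]{\lambda_1}, \sqrt[3]{\lambda_2}, \ldots, \sqrt[3]{\lambda_n})U$. Similarly, for any $r > 0$,
one may verify that the following $A^r$ is well defined (Problem 30):

\[ A^r = U^* \operatorname{diag}(\lambda_1^r, \lambda_2^r, \ldots, \lambda_n^r)U. \]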
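The definition of $A^r$ can be tried out directly. The sketch below assumes NumPy and uses `numpy.linalg.eigh`, which returns a factorization $A = Q \operatorname{diag}(w) Q^*$, so that $Q^*$ plays the role of $U$ in the text. The function name `psd_power` and the example matrix are illustrative.

```python
import numpy as np

def psd_power(A, r):
    """A^r for a Hermitian positive semidefinite A and r > 0."""
    w, Q = np.linalg.eigh(A)       # A = Q diag(w) Q^*
    w = np.clip(w, 0.0, None)      # remove tiny negative rounding errors
    return (Q * w**r) @ Q.conj().T # Q diag(w^r) Q^*

# Hypothetical example: eigenvalues 1 and 3.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
B = psd_power(A, 0.5)              # the square root A^{1/2}
C = psd_power(A, 1.0 / 3.0)        # the cube root A^{1/3}
```

Squaring `B` and cubing `C` both recover `A` up to rounding, and both roots are again positive semidefinite.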
The singular values of a matrix $A$ are defined to be the nonnegative
square roots of the eigenvalues of $A^*A$, which is positive
semidefinite, for $x^*(A^*A)x = (Ax)^*(Ax) \ge 0$. If we denote by $\sigma_i$ a
singular value and by $\lambda_i$ an eigenvalue, we may simply write

\[ \sigma_i(A) = \sqrt{\lambda_i(A^*A)}. \]


Theorem 3.6 (Singular Value Decomposition) Let $A$ be an
$m \times n$ matrix with nonzero singular values $\sigma_1, \sigma_2, \ldots, \sigma_r$. Then there
exist an $m \times m$ unitary $U$ and an $n \times n$ unitary $V$ such that

\[ A = U \begin{pmatrix} D_r & 0 \\ 0 & 0 \end{pmatrix} V, \tag{3.1} \]

where the block matrix is of size $m \times n$ and $D_r = \operatorname{diag}(\sigma_1, \sigma_2, \ldots, \sigma_r)$.


Proof. If $A$ is a number $c$, say, then the absolute value $|c|$ is the
singular value of $A$, and $A = |c|e^{i\theta}$ for some $\theta \in \mathbb{R}$. If $A$ is a nonzero
row or column vector, say, $A = (a_1, \ldots, a_n)$, then $\sigma_1$ is the norm
(length) of the vector $A$. Let $V$ be a unitary matrix with the first
row the unit vector $(\frac{a_1}{\sigma_1}, \ldots, \frac{a_n}{\sigma_1})$. Then $A = (\sigma_1, 0, \ldots, 0)V$.

We now assume $m > 1$, $n > 1$, and $A \ne 0$. Let $u_1$ be a unit
eigenvector of $A^*A$ belonging to $\sigma_1^2$; that is,

\[ (A^*A)u_1 = \sigma_1^2u_1, \quad u_1^*u_1 = 1. \]

Let

\[ v_1 = \frac{1}{\sigma_1}Au_1. \]

Then $v_1$ is a unit vector and a simple computation gives $u_1^*A^*v_1 = \sigma_1$.

Let $P$ and $Q$ be unitary matrices with $u_1$ and $v_1$ as their first
columns, respectively. Then, with $A^*v_1 = \sigma_1u_1$ and $u_1^*A^* = (Au_1)^* =
\sigma_1v_1^*$, we see the first column of $P^*A^*Q$ is $(\sigma_1, 0, \ldots, 0)^T$ and the first
row is $(\sigma_1, 0, \ldots, 0)$. It follows that

\[ P^*A^*Q = \begin{pmatrix} \sigma_1 & 0 \\ 0 & B \end{pmatrix} \quad \text{or} \quad A = Q \begin{pmatrix} \sigma_1 & 0 \\ 0 & B^* \end{pmatrix} P^* \]

for some $(n-1) \times (m-1)$ matrix $B$. The assertion follows by repeating
the process (or by induction) on $B^*$.
Apparently the rank of $A$ equals $r$ as $U$ and $V$ are nonsingular.
If $A$ is a real matrix, $U$ and $V$ can be chosen to be real. Besides,
when $A$ is an $n \times n$ matrix, $U$ and $V$ are $n \times n$ unitary matrices.
Writing $D$ for the block matrix in (3.1) and inserting $VV^*$ between $U$ and $D$, we have
$A = UVV^*DV = WP$, where $W = UV$ is unitary and $P = V^*DV$
is positive semidefinite. Since $A^*A = PW^*WP = P^2$, we see $P =
(A^*A)^{1/2}$, which is uniquely determined by the matrix $A$. We denote

\[ |A| = (A^*A)^{1/2} \]

and call it the modulus of $A$. Note that $|A|$ is positive semidefinite.


Theorem 3.7 (Polar Decomposition) For any square matrix $A$,
there exist unitary matrices $W$ and $V$ such that

\[ A = W|A| = |A^*|V. \]


The polar decomposition may be generalized for rectangular ma- 
trices with partial unitary matrices (Problem 24). The polar de- 
composition was proven by the singular value decomposition. One 
may prove the latter using the polar decomposition and the spectral 
decomposition. 
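The passage from the singular value decomposition to the polar decomposition can be followed numerically. This is a minimal sketch assuming NumPy; `numpy.linalg.svd` returns `U`, `s`, `Vh` with $A = U \operatorname{diag}(s)\, V_h$, so `Vh` corresponds to $V$ in (3.1). The example matrix and the variable names are illustrative.

```python
import numpy as np

# Hypothetical square example matrix.
A = np.array([[1.0, 2.0],
              [0.0, -1.0]])

U, s, Vh = np.linalg.svd(A)                # A = U diag(s) Vh
W = U @ Vh                                  # unitary factor W = UV
mod_A = Vh.conj().T @ np.diag(s) @ Vh       # |A| = (A^* A)^{1/2}
mod_A_star = U @ np.diag(s) @ U.conj().T    # |A^*| = (A A^*)^{1/2}

left_polar = W @ mod_A                      # W |A|
right_polar = mod_A_star @ W                # |A^*| W
```

Both `left_polar` and `right_polar` reproduce `A` up to rounding. In this construction the same unitary matrix `W` serves on both sides.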


Problems 


1. Let $B$ and $D$ be square matrices (of possibly different sizes). Let
\[ A = \begin{pmatrix} B & C \\ 0 & D \end{pmatrix}. \]
Show that every eigenvalue of $B$ or $D$ is an eigenvalue of $A$.


2. Find a matrix P so that P~'AP is diagonal; then compute A’, where 


el 7) 


3. Show that the complex symmetric A is not diagonalizable, where 


as, 


That is, P~'AP is not diagonal for any invertible matrix P. 
4. If $A$ is an upper-triangular matrix such that $A^*A = AA^*$, show that
$A$ is in fact a diagonal matrix.


5. Let $A$ be an n-square complex matrix. Show that

\[ x^*Ax = 0 \ \text{for all } x \in \mathbb{C}^n \;\Leftrightarrow\; A = 0 \]

and

\[ x^TAx = 0 \ \text{for all } x \in \mathbb{R}^n \;\Leftrightarrow\; A^T = -A. \]


6. Let $A \in M_n$. If $x^*Ax \in \mathbb{R}$ for all $x \in \mathbb{C}^n$, show that $A^* = A$ by
the previous problem or by making use of the spectral theorem on
$A - A^*$. (Note that $A - A^*$ is skew-Hermitian, thus normal.)


7. Let $A \in M_n$. Show that if $\lambda$ is an eigenvalue of $A$, then $\bar{\lambda}$ is an
eigenvalue of $A^*$, and that $\alpha \in \mathbb{C}$ is an eigenvalue of $f(A)$ if and only
if $\alpha = f(\lambda)$ for some eigenvalue $\lambda$ of $A$, where $f$ is a polynomial.


8. Show that if $A \in M_n$ has $n$ distinct eigenvalues, then $A$ is
diagonalizable. Does the converse hold?


9. Let $A$ be an n-square positive semidefinite matrix. Show that

(a) $X^*AX \ge 0$ for every $n \times m$ matrix $X$.

(b) Every principal submatrix of $A$ is positive semidefinite.


10. Let

\[ A = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & -1 \\ 0 & 0 & 0 \end{pmatrix}, \quad B = \begin{pmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}. \]

(a) Show that $A^3 = B^3 = C^3 = 0$, where $C = \lambda A + \mu B$, $\lambda, \mu \in \mathbb{C}$.

(b) Does there exist an integer $k$ such that $(AB)^k = 0$?

(c) Does there exist a nonsingular matrix $P$ such that $P^{-1}AP$ and
$P^{-1}BP$ are both upper-triangular?


11. Let $A$ be an n-square matrix and let $P$ be a nonsingular matrix of
the same size such that $P^{-1}AP$ is upper-triangular. Write

\[ P^{-1}AP = D - U, \]

where $D$ is a diagonal matrix and $U$ is an upper-triangular matrix
with 0 on the diagonal. Show that if $A$ is invertible, then

\[ A^{-1} = PD^{-1}\big( I + UD^{-1} + (UD^{-1})^2 + \cdots + (UD^{-1})^{n-1} \big)P^{-1}. \]
12. Let $A$ be an $n \times n$ real matrix. If all eigenvalues of $A$ are real, then
there exists an $n \times n$ real orthogonal matrix $Q$ such that $Q^TAQ$ is
upper-triangular. What if the eigenvalues of $A$ are not real?


13. Let $A$ be an n-square complex matrix. Show that

(a) (QR Factorization) There exist a unitary matrix $Q$ and an
upper-triangular matrix $R$ such that $A = QR$. Furthermore
$Q$ and $R$ can be chosen to be real if $A$ is real.

(b) (LU Factorization) If all the leading principal minors of $A$ are
nonzero, then $A = LU$, where $L$ and $U$ are lower- and
upper-triangular matrices, respectively.


14. Let $X_{ij}$ denote the $(i, j)$-entry of matrix $X$. If $A \ge 0$, show that

\[ (A^{1/2})_{ii} \le (A_{ii})^{1/2}. \]


15. Let $A \in M_n$ have rank $r$. Show that $A$ is normal if and only if

\[ A = \sum_{i=1}^{r} \lambda_iu_iu_i^*, \]

where $\lambda_i$ are complex numbers and $u_i$ are column vectors of a unitary
matrix. Further show that $A$ is Hermitian if and only if all $\lambda_i$ are real,
and $A$ is positive semidefinite if and only if all $\lambda_i$ are nonnegative.


16. Let $A$ be an $m \times n$ complex matrix with rank $r$. Show that

(a) $A$ has $r$ nonzero singular values.

(b) $A$ has at most $r$ nonzero eigenvalues (in case $m = n$).

(c) $A = U_rD_rV_r$, where $U_r$ is an $m \times r$ matrix, $D_r = \operatorname{diag}(\sigma_1, \ldots, \sigma_r)$,
$V_r$ is an $r \times n$ matrix, all of rank $r$, and $U_r^*U_r = V_rV_r^* = I_r$.

(d) $A = \sum_{i=1}^{r} \sigma_iu_iv_i^*$, where the $\sigma_i$ are the singular values of $A$,
and $u_i$ and $v_i$ are column vectors of some unitary matrices.


17. Let $A$ and $B$ be upper-triangular matrices with positive diagonal
entries. If $A = UB$ for some unitary $U$, show that $U = I$ and $A = B$.


18. Let $A$ be a square matrix. Show that $A$ has a zero singular value if
and only if $A$ has a zero eigenvalue. Does it follow that the number
of zero singular values is equal to that of the zero eigenvalues?


19. Show that two Hermitian matrices are (unitarily) similar if and only
if they have the same set of eigenvalues.
20. Let $A = P_1U_1 = P_2U_2$ be two polar decompositions of $A$. Show that
$P_1 = P_2$ and $U_1 = U_2$ if $A$ is nonsingular. What if $A$ is singular?


21. Let $P$ be a positive definite matrix and $U$ and $V$ be unitary matrices
such that $UP = PV$. Show that $U = V$.


22. Show that if $A$ is an n-square complex matrix, then there exist
nonsingular matrices $P$ and $Q$ such that $(PA)^2 = PA$, $(AQ)^2 = AQ$.


23. Let $A \in M_n$. Show that $|A| = UA$ for some unitary matrix $U$.

24. State and show the polar decomposition for rectangular matrices.


25. Let $A$ be an $m \times n$ complex matrix of rank $r$. Show that $A = ST$ for
some $m \times r$ matrix $S$ and $r \times n$ matrix $T$; both have rank $r$.


26. Show that $|\lambda_1 \cdots \lambda_n| = \sigma_1 \cdots \sigma_n$ for any $A \in M_n$, where the $\lambda_i$ and
$\sigma_i$ are the eigenvalues and singular values of $A$, respectively.


27. What can be said about $A \in M_n$ if all its singular values are equal?
Can the same conclusion be drawn in the case of eigenvalues?


28. For any n-square complex matrix $A$, show that

\[ \operatorname{tr}\Big( \frac{A^* + A}{2} \Big) \le \operatorname{tr}\big( (A^*A)^{1/2} \big). \]


29. If $A$ is a matrix with eigenvalues $\lambda_i$ and singular values $\sigma_i$, show that

\[ \sum_i |\lambda_i|^2 \le \operatorname{tr}(A^*A) = \operatorname{tr}(AA^*) = \sum_{i,j} |a_{ij}|^2 = \sum_i \sigma_i^2. \]

Equality occurs if and only if $A$ is normal. Use this and the matrix
with $(i, i+1)$ entry $\sqrt{x_i}$, $i = 1, 2, \ldots, n-1$, $(n, 1)$ entry $\sqrt{x_n}$, and 0
elsewhere to show the arithmetic mean-geometric mean inequality

\[ \Big( \prod_{i=1}^{n} x_i \Big)^{1/n} \le \frac{1}{n} \sum_{i=1}^{n} x_i. \]


30. Let $A \ge 0$. Show that the definition $A^r = U^* \operatorname{diag}(\lambda_1^r, \lambda_2^r, \ldots, \lambda_n^r)U$
is independent of the unitary matrix $U$. In other words, if $A =
V^* \operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)V = W^* \operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)W$, where $V$, $W$ are
unitary, then $V^* \operatorname{diag}(\lambda_1^r, \lambda_2^r, \ldots, \lambda_n^r)V = W^* \operatorname{diag}(\lambda_1^r, \lambda_2^r, \ldots, \lambda_n^r)W$.


31. Let $A, B \ge 0$ and $U$ be unitary, all $n \times n$. Show that for any $r > 0$,
(a) $(U^*AU)^r = U^*A^rU$.
(b) If $AB = BA$, then $AB \ge 0$ and $(AB)^r = A^rB^r$.
3.3 Annihilating Polynomials of Matrices


Given a polynomial in $\lambda$ with complex coefficients $a_m, a_{m-1}, \ldots, a_0$,

\[ f(\lambda) = a_m\lambda^m + a_{m-1}\lambda^{m-1} + \cdots + a_1\lambda + a_0, \]

one can always define a matrix polynomial for $A \in M_n$ by

\[ f(A) = a_mA^m + a_{m-1}A^{m-1} + \cdots + a_1A + a_0I. \]

We consider in this section the annihilating polynomials of a matrix;
that is, the polynomials $f(\lambda)$ for which $f(A) = 0$. Particular
attention is paid to the characteristic and minimal polynomials.


Theorem 3.8 Let $A$ be an n-square complex matrix. Then there
exists a nonzero polynomial $f(\lambda)$ over $\mathbb{C}$ such that $f(A) = 0$.


Proof. $M_n$ is a vector space over $\mathbb{C}$ of dimension $n^2$. Thus, any $n^2 + 1$
vectors in $M_n$ are linearly dependent. In particular, the matrices

\[ I, A, A^2, \ldots, A^{n^2} \]

are linearly dependent; namely, there exist numbers $a_0, a_1, a_2, \ldots, a_{n^2}$,
not all zero, such that

\[ a_0I + a_1A + a_2A^2 + \cdots + a_{n^2}A^{n^2} = 0. \]

Set

\[ f(\lambda) = a_0 + a_1\lambda + a_2\lambda^2 + \cdots + a_{n^2}\lambda^{n^2}. \]

Then $f(A) = 0$, as desired. ∎


Theorem 3.9 (Cayley-Hamilton) Let $A$ be an n-square complex
matrix and let $p_A(\lambda)$ be the characteristic polynomial of $A$; that is,

\[ p_A(\lambda) = \det(\lambda I - A). \]

Then

\[ p_A(A) = 0. \]
Proof. Let the eigenvalues of $A$ be $\lambda_1, \lambda_2, \ldots, \lambda_n$. We write $A$ by
triangularization as, for some invertible matrix $P$,

\[ A = P^{-1}TP, \]

where $T$ is an upper-triangular matrix with $\lambda_1, \lambda_2, \ldots, \lambda_n$ on the
diagonal. Factor the characteristic polynomial $p(\lambda) = p_A(\lambda)$ of $A$ as

\[ p(\lambda) = (\lambda - \lambda_1)(\lambda - \lambda_2) \cdots (\lambda - \lambda_n). \]

Then

\[ p(A) = p(P^{-1}TP) = P^{-1}p(T)P. \]

Note that

\[ p(T) = (T - \lambda_1I)(T - \lambda_2I) \cdots (T - \lambda_nI). \]

It can be shown inductively that $(T - \lambda_1I) \cdots (T - \lambda_kI)$ has the first $k$
columns equal to zero, $1 \le k \le n$. Thus, $p(T) = 0$, and $p(A) = 0$. ∎
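The Cayley-Hamilton theorem is easy to test numerically on a small example. The sketch below assumes NumPy; `numpy.poly` applied to a square array returns the coefficients of $\det(\lambda I - A)$, highest degree first. The helper `poly_at_matrix` and the example matrix are illustrative.

```python
import numpy as np

def poly_at_matrix(coeffs, A):
    """Evaluate a polynomial (coefficients from the highest degree down)
    at the square matrix A by Horner's rule."""
    n = A.shape[0]
    result = np.zeros((n, n), dtype=complex)
    for c in coeffs:
        result = result @ A + c * np.eye(n)
    return result

# Hypothetical example matrix.
A = np.array([[2.0, -1.0, 0.0],
              [1.0, 3.0, 4.0],
              [0.0, 5.0, -2.0]])

char_coeffs = np.poly(A)                  # coefficients of det(lambda*I - A)
residual = poly_at_matrix(char_coeffs, A) # p_A(A), expected to vanish
```

The matrix `residual` is zero up to rounding error, in agreement with $p_A(A) = 0$.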


A monic polynomial $m(\lambda)$ is called the minimal polynomial of a
matrix $A$ if $m(A) = 0$ and it is of the smallest degree in the set

\[ \{ f(\lambda) : f(A) = 0 \}. \]

It is immediate that if $f(A) = 0$, then $m(\lambda)$ divides $f(\lambda)$, or, in
symbols, $m(\lambda) \mid f(\lambda)$, because otherwise we may write

\[ f(\lambda) = q(\lambda)m(\lambda) + r(\lambda), \]

where $r(\lambda) \ne 0$ is of smaller degree than $m(\lambda)$ and $r(A) = 0$, a
contradiction. In particular, the minimal polynomial divides the
characteristic polynomial. Note that both the characteristic polynomial
and the minimal polynomial are uniquely determined by the matrix.


Theorem 3.10 Similar matrices have the same minimal polyno- 
mial. 


Proof. Let $A$ and $B$ be similar matrices such that $A = P^{-1}BP$
for some nonsingular matrix $P$, and let $m_A(\lambda)$ and $m_B(\lambda)$ be the
minimal polynomials of $A$ and $B$, respectively. Then

\[ m_B(A) = m_B(P^{-1}BP) = P^{-1}m_B(B)P = 0. \]

Thus, $m_A(\lambda)$ divides $m_B(\lambda)$. Similarly, $m_B(\lambda)$ divides $m_A(\lambda)$. Hence
$m_A(\lambda) = m_B(\lambda)$ as they are both of leading coefficient 1. ∎


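For a polynomial with (real or) complex coefficients

$$p(x) = x^n + c_{n-1} x^{n-1} + \cdots + c_1 x + c_0,$$

one may construct an $n$-square (real or) complex matrix

$$C = \begin{pmatrix} 0 & 0 & \cdots & 0 & -c_0 \\ 1 & 0 & \cdots & 0 & -c_1 \\ \vdots & \ddots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & -c_{n-1} \end{pmatrix}.$$

Such a matrix $C$ is known as the companion matrix of $p(x)$.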

By expanding $\det(xI - C)$, we see $p(x) = \det(xI - C)$; that is, $p(x)$
is the characteristic polynomial of its companion matrix. $p(x)$ is also
the minimal polynomial of $C$. To see this, let $e_1, e_2, \ldots, e_n$ be the
column vectors with the 1st, 2nd, ..., $n$th component 1, respectively,
and all other components 0. Then $Ce_i$ is the $i$th column of $C$, $i = 1, 2, \ldots, n$.
By looking at the first $n - 1$ columns of $C$, we have

$$Ce_1 = e_2, \quad Ce_2 = e_3 = C^2 e_1, \quad \ldots, \quad Ce_{n-1} = e_n = C^{n-1} e_1$$

and

$$Ce_n = -c_0 e_1 - c_1 e_2 - \cdots - c_{n-1} e_n.$$

If $q(x) = x^m + d_{m-1} x^{m-1} + d_{m-2} x^{m-2} + \cdots + d_1 x + d_0$ is a
polynomial such that $q(C) = 0$, $m < n$, we compute $q(C)e_1 = 0$ to get

$$0 = C^m e_1 + d_{m-1} C^{m-1} e_1 + d_{m-2} C^{m-2} e_1 + \cdots + d_1 C e_1 + d_0 e_1$$
$$\phantom{0} = e_{m+1} + d_{m-1} e_m + d_{m-2} e_{m-1} + \cdots + d_1 e_2 + d_0 e_1.$$

This says that $e_1, e_2, \ldots, e_n$ are linearly dependent, a contradiction.
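The argument above can be traced on a small case. The sketch below is an illustration only: the function name `companion` and the cubic $x^3 - 6x^2 + 11x - 6$ are assumptions made for the example. It builds the companion matrix, confirms that $e_1, Ce_1, C^2e_1$ are the standard basis vectors, that $p(C) = 0$, and that a monic quadratic applied to $e_1$ cannot vanish.

```python
def companion(c):
    """Companion matrix of x^n + c[n-1] x^(n-1) + ... + c[1] x + c[0]:
    ones on the subdiagonal and -c[0], ..., -c[n-1] in the last column."""
    n = len(c)
    C = [[0] * n for _ in range(n)]
    for i in range(1, n):
        C[i][i - 1] = 1          # gives C e_i = e_{i+1} for i < n
    for i in range(n):
        C[i][n - 1] = -c[i]      # gives C e_n = -c_0 e_1 - ... - c_{n-1} e_n
    return C


def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def matvec(X, v):
    return [sum(X[i][k] * v[k] for k in range(len(v))) for i in range(len(X))]


def poly_at_matrix(coeffs, X):
    """Horner evaluation of coeffs[0]*I + coeffs[1]*X + ... at the matrix X."""
    n = len(X)
    result = [[0] * n for _ in range(n)]
    for a in reversed(coeffs):
        result = matmul(result, X)
        for i in range(n):
            result[i][i] += a
    return result


# p(x) = x^3 - 6x^2 + 11x - 6, so (c_0, c_1, c_2) = (-6, 11, -6).
c = [-6, 11, -6]
C = companion(c)

# Krylov vectors e_1, C e_1, C^2 e_1.
krylov = [[1, 0, 0]]
for _ in range(len(c) - 1):
    krylov.append(matvec(C, krylov[-1]))

p_of_C = poly_at_matrix(c + [1], C)

# q(x) = x^2 - 2x + 5 (arbitrary monic quadratic): q(C) e_1 = e_3 - 2 e_2 + 5 e_1.
q_of_C_e1 = matvec(poly_at_matrix([5, -2, 1], C), [1, 0, 0])
```

Because $q(C)e_1$ always has last component 1 for a monic quadratic $q$, no polynomial of degree less than 3 can annihilate this $C$.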

An effective method of computing minimal polynomials is given in
the next section in conjunction with the discussion of Jordan canonical
forms of square matrices.


Problems


1. Let A = (37) and $f(x) = 3x^2 - 5x - 2$. Find $f(A)$.


2. Find the characteristic and minimal polynomials of the matrices


(00): (03): G2): 


. Find the characteristic and minimal polynomials of the matrices 


nN 


1 0 
| © & 4 
0 0X 


2.- =2 0 
A={|-2 1 -2 
0 2 0 


. Compute det(AB— A) and f(A), where f(A) = det(AB — A) and 


an(38). 29(3 3) 


. Let $A \in M_n$. Show that there exists a polynomial $f(x)$ with real
coefficients such that $f(A) = 0$.


. Let $A$ and $B$ be $n \times n$ matrices, and let $f(\lambda) = \det(\lambda I - B)$. Show that
$f(A)$ is invertible if and only if $A$ and $B$ have no common eigenvalues.


. Let $A$ and $B$ be square matrices of the same size. Show that if $A$
and $B$ are similar, then so are $f(A)$ and $f(B)$ for any polynomial $f$.


. Let $A$ and $B$ be $n \times n$ matrices. If $A$ and $B$ are similar, show that
$f(A) = 0$ if and only if $f(B) = 0$. Is the converse true?


10. Show that $\operatorname{rank}(AB) = n - 2$ if $A$ and $B$ are $n$-square upper-triangular
matrices of rank $n - 1$ with diagonal entries zero.


11. Let $A_1, \ldots, A_m$ be upper-triangular matrices in $M_n$. If they all have
diagonal entries zero, show that $A_1 \cdots A_m = 0$ when $m \ge n$.


12. Explain what is wrong with the following proof of the Cayley-Hamilton
theorem. Because $p(\lambda) = \det(\lambda I - A)$, plugging $A$ for $\lambda$ directly in
both sides gives $p(A) = \det(A - A) = 0$.


13. As is known, for square matrices $A$ and $B$ of the same size, $AB$ and
$BA$ have the same characteristic polynomial (Section 2.4). Do they
have the same minimal polynomial?


14. Let $f(x)$ be a polynomial and let $\lambda$ be an eigenvalue of a square
matrix $A$. Show that if $f(A) = 0$, then $f(\lambda) = 0$.


15. Let $v \in \mathbb{C}^n$ and $A \in M_n$. If $f(\lambda)$ is the monic polynomial with the
smallest degree such that $f(A)v = 0$, show that $f(\lambda)$ divides $m_A(\lambda)$.


16. Let $c_0, c_1, \ldots, c_{n-1} \in \mathbb{C}$ and let $C$ and $D$ be, respectively,

$$\begin{pmatrix} 0 & 0 & \cdots & -c_0 \\ 1 & 0 & \cdots & -c_1 \\ \vdots & \ddots & \ddots & \vdots \\ 0 & \cdots & 1 & -c_{n-1} \end{pmatrix}, \qquad \begin{pmatrix} -c_{n-1} & -c_{n-2} & \cdots & -c_0 \\ 1 & 0 & \cdots & 0 \\ \vdots & \ddots & \ddots & \vdots \\ 0 & \cdots & 1 & 0 \end{pmatrix}.$$

Show that $SCS = D^T$, where $S$ is the backward identity; that is,

$$S = \begin{pmatrix} 0 & & 1 \\ & \cdot^{\,\cdot^{\,\cdot}} & \\ 1 & & 0 \end{pmatrix}.$$

Show that the matrices $C$, $C^T$, $D$, and $D^T$ all have the polynomial

$$p(x) = x^n + c_{n-1} x^{n-1} + \cdots + c_1 x + c_0$$

as their characteristic and minimal polynomials.


17. Let $C$ be the companion matrix of the polynomial $p(x) = x^n + c_{n-1} x^{n-1} + \cdots + c_1 x + c_0$.
Show that $C$ is nonsingular if and only if $c_0 \neq 0$. In the case where $C$ is
nonsingular, find the inverse of $C$.


18. Let $A, B, C \in M_n$. If $X \in M_n$ satisfies $AX^2 + BX + C = 0$ and if
$\lambda$ is an eigenvalue of $X$, show that $\lambda^2 A + \lambda B + C$ is singular.


19. Let $A \in M_n$, $p(x) = \det(xI - A)$, and $P(x) = \operatorname{adj}(xI - A)$. Show
that every entry in the matrix $x^k P(x) - P(x) A^k$ is divisible by $p(x)$
for $k = 1, 2, \ldots$.


20. Let $A$ and $B$ be $n$-square complex matrices. Show that

$$AX - XB = 0 \ \Rightarrow\ f(A)X - Xf(B) = 0$$

for every polynomial $f$. In addition, if $A$ and $B$ have no common
eigenvalues, then $AX - XB = 0$ has only the solution $X = 0$.
21. Let $A$ and $B$ be $2n \times 2n$ matrices partitioned conformally as

$$A = \begin{pmatrix} A_{11} & 0 \\ 0 & A_{22} \end{pmatrix}, \qquad B = \begin{pmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{pmatrix}.$$

If $AB = BA$ and $A_{11}$ and $A_{22}$ have no common eigenvalue, show that
$B_{12} = B_{21} = 0$.


22. Show that for any nonsingular matrix $A$, the matrices $A^{-1}$ and $\operatorname{adj}(A)$
can be expressed as polynomials in $A$.


23. Express $J^{-1}$ as a polynomial in $J$, where

$$J = \begin{pmatrix} 1 & 1 & & 0 \\ & 1 & \ddots & \\ & & \ddots & 1 \\ 0 & & & 1 \end{pmatrix}.$$


24. For a square matrix $X$, we denote $e^X = \sum_{k=0}^{\infty} \frac{1}{k!} X^k$. Let $A$ be an
$n$-square matrix with all its eigenvalues equal to $\lambda$. Show that

$$e^{tA} = e^{t\lambda} \sum_{k=0}^{n-1} \frac{t^k}{k!} (A - \lambda I)^k, \qquad t \in \mathbb{C}.$$

In particular, if $A$ is a $3 \times 3$ matrix having eigenvalues $\lambda, \lambda, \lambda$, then

$$e^{tA} = e^{t\lambda} \left( I + t(A - \lambda I) + \tfrac{t^2}{2} (A - \lambda I)^2 \right).$$


25. Let $X$ and $Y$ be positive semidefinite matrices of the same size such
that $X \ge Y$, i.e., $X - Y \ge 0$. Does it necessarily follow that $e^X \ge e^Y$?


3.4 Jordan Canonical Forms


We saw in Section 3.2 that a square matrix is similar (and even
unitarily similar) to an upper-triangular matrix. We now discuss the
upper-triangular matrices and give simpler structures.

The main theorem of this section is the Jordan decomposition,
which states that every square complex matrix is similar (not necessarily
unitarily similar) to a direct sum of Jordan blocks, referred to
as Jordan canonical form or simply Jordan form. A Jordan block is
a square matrix in the form

$$J_x = \begin{pmatrix} x & 1 & & 0 \\ & x & \ddots & \\ & & \ddots & 1 \\ 0 & & & x \end{pmatrix}. \qquad (3.2)$$

For this purpose we introduce $\lambda$-matrices as a tool and use elementary
operations to bring $\lambda$-matrices to so-called standard forms.
We then show that two matrices $A$ and $B$ in $M_n$ are similar if and
only if their $\lambda$-matrices $\lambda I - A$ and $\lambda I - B$ can be brought to the
same standard form. Thus, a square matrix $A$ is similar to its Jordan
form that is determined by the standard form of the $\lambda$-matrix $\lambda I - A$.

To proceed, a $\lambda$-matrix is a matrix whose entries are complex
polynomials in $\lambda$. For instance,


1 2-2 (A+1) 
5A—1 0 1—2\—\” 
-1 1 \-i 


is a $\lambda$-matrix, for every entry is a polynomial in $\lambda$ (or a constant).
We perform operations (i.e., addition and multiplication) on $\lambda$-matrices
in the same way as we do for numerical matrices. Of course,
here the polynomials obey their usual rules of operations. Note that
division by a nonzero polynomial is not permitted.
Note that the minimal polynomial of $J_x$ of size $t$, say, in (3.2) is

$$m(\lambda) = (\lambda - x)^t.$$

Elementary operations on $\lambda$-matrices are similar to those on numerical
matrices. Elementary $\lambda$-matrices and invertible $\lambda$-matrices
are similarly defined as those of numerical matrices.

Any square numerical matrix can be brought into a diagonal matrix
with 1 and 0 on the main diagonal by elementary operations.
Likewise, $\lambda$-matrices can be brought into the standard form

$$D(\lambda) = \operatorname{diag}(d_1(\lambda), \ldots, d_k(\lambda), 0, \ldots, 0), \qquad (3.3)$$

where $d_i(\lambda) \mid d_{i+1}(\lambda)$, $i = 1, \ldots, k-1$, and each $d_i(\lambda)$ is 1 or monic.
Therefore, for any $\lambda$-matrix $A(\lambda)$ there exist elementary $\lambda$-matrices
$P_1(\lambda), \ldots, P_s(\lambda)$ and $Q_1(\lambda), \ldots, Q_t(\lambda)$ such that

$$P_s(\lambda) \cdots P_1(\lambda) A(\lambda) Q_1(\lambda) \cdots Q_t(\lambda) = D(\lambda)$$

is in the standard form (3.3).

If $A(\lambda)$ is an invertible $\lambda$-matrix; that is, $B(\lambda)A(\lambda) = I$ for some
$\lambda$-matrix $B(\lambda)$ of the same size, then, by taking determinants, we
see that $\det A(\lambda)$ is a nonzero constant. Conversely, if $\det A(\lambda)$ is a
nonzero constant, then $(\det A(\lambda))^{-1} \operatorname{adj}(A(\lambda))$ is also a $\lambda$-matrix and
it is the inverse of $A(\lambda)$. Moreover, a square $\lambda$-matrix is invertible if
and only if its standard form (3.3) is the identity matrix and if and
only if it is a product of elementary $\lambda$-matrices.

For the $\lambda$-matrix $\lambda I - A$, $A \in M_n$, we have $k = n$ and $D(\lambda) = \operatorname{diag}(d_1(\lambda), \ldots, d_n(\lambda))$.
The $d_i(\lambda)$ are called the invariant factors of
$A$, and the divisors of $d_i(\lambda)$ factored into the form $(\lambda - x)^t$ for some
constant $x$ and positive integer $t$ are the elementary divisors of $A$.

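To illustrate this, look at the example

$$A = \begin{pmatrix} -1 & 0 & 0 \\ 1 & 1 & 2 \\ 3 & 0 & 1 \end{pmatrix}.$$

We perform elementary operations on the $\lambda$-matrix

$$\lambda I - A = \begin{pmatrix} \lambda+1 & 0 & 0 \\ -1 & \lambda-1 & -2 \\ -3 & 0 & \lambda-1 \end{pmatrix}.$$

Interchange row 1 and row 2 times $-1$ to get a 1 for the (1, 1) position:

$$\begin{pmatrix} 1 & 1-\lambda & 2 \\ \lambda+1 & 0 & 0 \\ -3 & 0 & \lambda-1 \end{pmatrix}.$$

Add row 1 times $-(\lambda+1)$ and 3 to rows 2 and 3, respectively, to get
0 below 1:

$$\begin{pmatrix} 1 & 1-\lambda & 2 \\ 0 & (\lambda-1)(\lambda+1) & -2(\lambda+1) \\ 0 & -3(\lambda-1) & \lambda+5 \end{pmatrix}.$$

Add row 3 times 2 to row 2 to get a nonzero number 8:

$$\begin{pmatrix} 1 & 1-\lambda & 2 \\ 0 & \lambda^2-6\lambda+5 & 8 \\ 0 & -3(\lambda-1) & \lambda+5 \end{pmatrix}.$$

Interchange column 2 and column 3 to get a nonzero number for the
(2, 2) position:

$$\begin{pmatrix} 1 & 2 & 1-\lambda \\ 0 & 8 & \lambda^2-6\lambda+5 \\ 0 & \lambda+5 & -3(\lambda-1) \end{pmatrix}.$$

Subtract the second row times $\frac{1}{8}(\lambda+5)$ from row 3 to get

$$\begin{pmatrix} 1 & 2 & 1-\lambda \\ 0 & 8 & \lambda^2-6\lambda+5 \\ 0 & 0 & -\frac{1}{8}(\lambda-1)^2(\lambda+1) \end{pmatrix},$$

which gives the standard form at once (by column operations)

$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & (\lambda+1)(\lambda-1)^2 \end{pmatrix}.$$

Thus, the invariant factors of $A$ are

$$d_1(\lambda) = 1, \quad d_2(\lambda) = 1, \quad d_3(\lambda) = (\lambda+1)(\lambda-1)^2,$$

and the elementary divisors of $A$ are

$$\lambda+1, \quad (\lambda-1)^2.$$

Note that the matrix, a direct sum of two Jordan blocks,

$$J = \begin{pmatrix} -1 & 0 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix} = (-1) \oplus \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$$

has the same invariant factors and elementary divisors as $A$.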

In general, each elementary divisor $(\lambda - x)^t$ corresponds to a
Jordan block in the form (3.2). Consider all the elementary divisors
of a matrix $A$, find all the corresponding Jordan blocks, and form a
direct sum of them. A profound conclusion is that $A$ is similar to this
direct sum. To this end, we need to show a fundamental theorem.


Theorem 3.11 Let $A$ and $B$ be $n$-square complex matrices. Then
$A$ and $B$ are similar if and only if $\lambda I - A$ and $\lambda I - B$ have the same
standard form. Equivalently, there exist $\lambda$-matrices $P(\lambda)$ and $Q(\lambda)$
that are products of elementary $\lambda$-matrices such that

$$P(\lambda)(\lambda I - A)Q(\lambda) = \lambda I - B.$$


This implies our main theorem on the Jordan canonical form, 
which is one of the most useful results in linear algebra and matrix 
theory. The theorem itself is much more important than its proof. 
We sketch the proof of Theorem 3.11 as follows. 


Proof outline. If $A$ and $B$ are similar, then there exists an invertible
complex matrix $P$ such that $PAP^{-1} = B$. It follows that

$$P(\lambda I - A)P^{-1} = \lambda I - B.$$

To show the other way, let $P(\lambda)$ and $Q(\lambda)$ be invertible $\lambda$-matrices
such that (we put $Q(\lambda)$ for $Q(\lambda)^{-1}$ on the right for convenience)

$$P(\lambda)(\lambda I - A) = (\lambda I - B)Q(\lambda). \qquad (3.4)$$

Write (Problem 4)

$$P(\lambda) = (\lambda I - B)P_1(\lambda) + P, \qquad Q(\lambda) = Q_1(\lambda)(\lambda I - A) + Q,$$

where $P_1(\lambda)$, $Q_1(\lambda)$ are $\lambda$-matrices, and $P$, $Q$ are numerical matrices.
Identity (3.4) implies $P_1(\lambda) - Q_1(\lambda) = 0$ by considering the degree
of $P_1(\lambda) - Q_1(\lambda)$. It follows that $Q = P$, and thus $PA = BP$.
It remains to show that $P$ is invertible. Assume $R(\lambda)$ is the
inverse of $P(\lambda)$, or $P(\lambda)R(\lambda) = I$. Write $R(\lambda) = (\lambda I - A)R_1(\lambda) + R$,
where $R$ is a numerical matrix. With $PA = BP$, $I = P(\lambda)R(\lambda)$ gives

$$I = (\lambda I - B)T(\lambda) + PR, \qquad (3.5)$$

where

$$T(\lambda) = P_1(\lambda)(\lambda I - A)R_1(\lambda) + P_1(\lambda)R + PR_1(\lambda).$$

By considering the degree of both sides of (3.5), $T(\lambda)$ must be
zero. Therefore, $I = PR$ and hence $P$ is nonsingular. ∎


Based on the earlier discussions, we conclude our main result. 


Theorem 3.12 (Jordan Decomposition) Let $A$ be a square complex
matrix. Then there exists an invertible matrix $P$ such that

$$P^{-1}AP = J_1 \oplus \cdots \oplus J_s,$$

where the $J_i$ are the Jordan blocks of $A$ with the eigenvalues of $A$ on
the diagonal. The Jordan blocks are uniquely determined by $A$.


The uniqueness of the Jordan decomposition of $A$ up to permutations
of the diagonal Jordan blocks follows from the uniqueness
of the standard form (3.3) of $\lambda I - A$. Two different sets of Jordan
blocks will result in two different standard forms (3.3).

To find the minimal polynomial of a given matrix $A \in M_n$, reduce
$\lambda I - A$ by elementary operations to a standard form with invariant
factors $d_1(\lambda), \ldots, d_n(\lambda)$, $d_i(\lambda) \mid d_{i+1}(\lambda)$, for $i = 1, \ldots, n-1$. Note
that similar matrices have the same minimal polynomial (Theorem
3.10). Thus, $d_n(\lambda)$ is the minimal polynomial of $A$, because it is the
minimal polynomial of the Jordan canonical form of $A$ (Problem 12).

Theorem 3.13 Let $p(\lambda)$ and $m(\lambda)$ be, respectively, the characteristic
and minimal polynomials of matrix $A \in M_n$. Let $d_1(\lambda), \ldots, d_n(\lambda)$
be the invariant factors of $A$, where $d_i(\lambda) \mid d_{i+1}(\lambda)$, $i = 1, \ldots, n-1$. Then

$$p(\lambda) = d_1(\lambda) d_2(\lambda) \cdots d_n(\lambda), \qquad m(\lambda) = d_n(\lambda).$$

In the earlier example preceding Theorem 3.11, the characteristic
and minimal polynomials of $A$ are the same, and they are equal to

$$p(\lambda) = m(\lambda) = (\lambda+1)(\lambda-1)^2 = \lambda^3 - \lambda^2 - \lambda + 1.$$
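The claim about the example can be confirmed by direct multiplication. The sketch below assumes that the example matrix is $A = \begin{pmatrix} -1 & 0 & 0 \\ 1 & 1 & 2 \\ 3 & 0 & 1 \end{pmatrix}$, read off from the first reduction step $\lambda I - A$ of the worked example; the helper names are illustrative. Since the minimal polynomial is a monic divisor of $(\lambda+1)(\lambda-1)^2$, it is enough to see that neither degree-2 divisor annihilates $A$.

```python
def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def shift(A, t):
    """Return A - t*I."""
    n = len(A)
    return [[A[i][j] - (t if i == j else 0) for j in range(n)] for i in range(n)]


def is_zero(M):
    return all(x == 0 for row in M for x in row)


# Assumed example matrix (recovered from the reduction of lambda*I - A).
A = [[-1, 0, 0], [1, 1, 2], [3, 0, 1]]
A_plus_I = shift(A, -1)
A_minus_I = shift(A, 1)

# p(A) = (A + I)(A - I)^2 should be the zero matrix.
p_of_A = matmul(A_plus_I, matmul(A_minus_I, A_minus_I))

# The two monic divisors of degree 2: (l+1)(l-1) and (l-1)^2.
divisor_values = [matmul(A_plus_I, A_minus_I), matmul(A_minus_I, A_minus_I)]
```

If a divisor of degree 1 annihilated $A$, so would each of its degree-2 multiples, so checking the two quadratics covers all proper divisors.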


Problems 


1. Find the invariant factors, elementary divisors, characteristic and
minimal polynomials, and the Jordan canonical form of the matrix

$$A = \begin{pmatrix} 3 & 1 & -3 \\ -7 & -2 & 9 \\ -2 & -1 & 4 \end{pmatrix}.$$


2. Show that A and B are similar but not unitarily similar, where 


0 2 0 3 
A=(4 or B=(4 ade 
What is the Jordan canonical form J of A and B? Can one find an 
invertible matrix P such that 


$$P^{-1}AP = P^{-1}BP = J?$$


3. Are the following matrices similar? Why? 
L £0 1 0 0 1 2 0 
O11], 2 1 0 4, 2 1 0 
0 0 1 0 2 1 00 1 


4. Let $A \in M_n$. If $P(\lambda)$ is an $n$-square $\lambda$-matrix, show that there exist
a $\lambda$-matrix $S(\lambda)$ and a numerical matrix $T$ such that

$$P(\lambda) = (\lambda I - A)S(\lambda) + T.$$

[Hint: Write $P(\lambda) = \lambda^m P_m + \lambda^{m-1} P_{m-1} + \cdots + \lambda P_1 + P_0$.]


5. Show that a $\lambda$-matrix is invertible if and only if it is a product of
elementary $\lambda$-matrices.


6. Show that two matrices are similar if and only if they have the same 
set of Jordan blocks, counting the repeated ones. 


7. Find the invariant factors, elementary divisors, and characteristic 
and minimal polynomials for each of the following matrices. 


100 =1. 0 6 01 0 i £0 
011], Oo iby Pw toy. oe tO Dy 
001 0 01 100 011 
i i @ 6 1-1 0 0 0 1 0 
0. 1-th 6 Oo i =f 6 040 1 
OO LA lO @ & at] | Oo oa a 
0001 0 0 0 1 000 A 


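8. Find the Jordan canonical form of the matrix

$$P = \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ 1 & 0 & 0 & \cdots & 0 \end{pmatrix}.$$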


9. Let A be a square complex matrix with invariant factors 
1, AA—2), APR(A—2). 
Answer the following questions. 


(a) What is the characteristic polynomial of A?
(b) What is the minimal polynomial of A?
(c) What are the elementary divisors of A?
(d) What is the size of A?
(e) What is the rank of A?
(f) What is the trace of A?
(g) What is the Jordan form of A?


10. Let $J$ be a Jordan block. Find the Jordan forms of $J^{-1}$ (if it exists)
and $J^2$.


11. Show that every Jordan block $J$ is similar to $J^T$ via $S$:

$$S^{-1} J S = J^T,$$

where $S$ is the backward identity matrix, that is, $s_{i,n-i+1} = 1$ for
$i = 1, 2, \ldots, n$, and 0 elsewhere.


12. Show that the last invariant factor $d_n(\lambda)$ in the standard form of
$\lambda I - A$ is the minimal polynomial of $A \in M_n$.


13. Let $p(x) = x^n + c_{n-1}x^{n-1} + c_{n-2}x^{n-2} + \cdots + c_1 x + c_0$. What are the
invariant factors of the companion matrix of $p(x)$? Show that the
minimal polynomial of the companion matrix of $p(x)$ is $p(x)$.


14. If $J$ is a Jordan block such that $J^2 = J$, show that $J = 1$ or $0$.


15. Let $A$ be an $n \times n$ matrix. Show that $\operatorname{rank}(A^2) = \operatorname{rank}(A)$ implies
$\operatorname{rank}(A^k) = \operatorname{rank}(A)$ for any integer $k > 0$ and $\mathbb{C}^n = \operatorname{Im} A \oplus \operatorname{Ker} A$.


16. Let $A$ be an $n \times n$ matrix. Show that $\operatorname{rank}(A^k) = \operatorname{rank}(A^{k+1})$ for
some positive integer $k \le n$ and that $\operatorname{rank}(A^k) = \operatorname{rank}(A^m)$ for all
positive integers $m > k$. In particular, $\operatorname{rank}(A^n) = \operatorname{rank}(A^{n+1})$.


17. Show that every matrix $A \in M_n$ can be written as $A = B + C$, where
$C^k = 0$ for some integer $k$, $B$ is diagonalizable, and $BC = CB$.


18. Let $A$ be a square complex matrix. If $Ax = 0$ whenever $A^2 x = 0$,
show that $A$ does not have any Jordan block of order more than 1
corresponding to eigenvalue 0.


19. Show that if matrix $A$ has all eigenvalues equal to 1, then $A^k$ is
similar to $A$ for every positive integer $k$. Discuss the converse.


20. Show that the dimension of the vector space of all the polynomials
in $A$ is equal to the degree of the minimal polynomial of $A$.


21. Let $A$ be an $n \times n$ matrix such that $A^k v = 0$ and $A^{k-1} v \neq 0$ for
some vector $v$ and positive integer $k$. Show that $v, Av, \ldots, A^{k-1}v$
are linearly independent. What is the Jordan form of $A$?


22. Let $A \in M_n$ be a Jordan block. Show that there exists a vector $v$
such that $v, Av, \ldots, A^{n-1}v$ constitute a basis for $\mathbb{C}^n$.


23. Show that the characteristic polynomial coincides with the minimal
polynomial for $A \in M_n$ if and only if $v, Av, \ldots, A^{n-1}v$ are linearly
independent for some vector $v \in \mathbb{C}^n$. What can be said about the
Jordan form (or Jordan blocks) of $A$?
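24. Let $A$ be an $n$-square complex matrix. Show that for any nonzero
vector $v \in \mathbb{C}^n$, there exists an eigenvector $u$ of $A$ that is contained
in the span of $v, Av, A^2v, \ldots$. [Hint: $v, Av, A^2v, \ldots, A^kv$ are linearly
dependent for some $k$. Find a related polynomial then factor it out.]


25. Let $A$ be an $n$-square complex matrix with the characteristic polynomial
factored over the complex field $\mathbb{C}$ as

$$\det(\lambda I - A) = (\lambda - \lambda_1)^{r_1} (\lambda - \lambda_2)^{r_2} \cdots (\lambda - \lambda_s)^{r_s},$$

where $\lambda_1, \lambda_2, \ldots, \lambda_s$ are the distinct eigenvalues of $A$. Show that the
following statements are equivalent.

(a) $A$ is diagonalizable; namely, $A$ is similar to a diagonal matrix.
All the elementary divisors of $\lambda I - A$ are linear.
(d) The minimal polynomial of $A$ has no repeated zeros.
(e) $\operatorname{rank}(\lambda I - A) = \operatorname{rank}(\lambda I - A)^2$ for every eigenvalue $\lambda$.
(f) $\operatorname{rank}(cI - A) = \operatorname{rank}(cI - A)^2$ for every complex number $c$.
(g) $(\lambda I - A)x = 0$ and $(\lambda I - A)^2 x = 0$ have the same solution space
for every eigenvalue $\lambda$.
(h) $(cI - A)x = 0$ and $(cI - A)^2 x = 0$ have the same solution space
for every complex number $c$.
(i) $\dim V_{\lambda_i} = r_i$ for each eigenspace $V_{\lambda_i}$ of eigenvalue $\lambda_i$.
(j) $\operatorname{rank}(\lambda_i I - A) = n - r_i$ for every eigenvalue $\lambda_i$.
(k) $\operatorname{Im}(\lambda I - A) \cap \operatorname{Ker}(\lambda I - A) = \{0\}$ for every eigenvalue $\lambda$.
(l) $\operatorname{Im}(cI - A) \cap \operatorname{Ker}(cI - A) = \{0\}$ for every complex number $c$.


26. Let $\mathcal{A}$ be a linear transformation on a finite-dimensional vector space.
Let $\lambda$ be an eigenvalue of $\mathcal{A}$. Show that each subspace $\operatorname{Ker}(\lambda\mathcal{I} - \mathcal{A})^k$,
where $k$ is a positive integer, is invariant under $\mathcal{A}$, and that

$$\operatorname{Ker}(\lambda\mathcal{I} - \mathcal{A}) \subseteq \operatorname{Ker}(\lambda\mathcal{I} - \mathcal{A})^2 \subseteq \operatorname{Ker}(\lambda\mathcal{I} - \mathcal{A})^3 \subseteq \cdots.$$

Conclude that for some positive integer $m$,

$$\operatorname{Ker}(\lambda\mathcal{I} - \mathcal{A})^m = \operatorname{Ker}(\lambda\mathcal{I} - \mathcal{A})^{m+1} = \cdots$$

and that

$$\bigcup_{k=1}^{\infty} \operatorname{Ker}(\lambda\mathcal{I} - \mathcal{A})^k = \operatorname{Ker}(\lambda\mathcal{I} - \mathcal{A})^m.$$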
3.5 The Matrices $A^T$, $\bar{A}$, $A^*$, $A^TA$, $A^*A$, and $\bar{A}A$


The matrices associated with a matrix $A$ and often encountered are

$$A^T, \quad \bar{A}, \quad A^*, \quad A^TA, \quad A^*A, \quad \bar{A}A,$$

where $T$, $\bar{\ }$, and $*$ mean transpose, conjugate, and transpose conjugate,
respectively. All these matrices, except $A^TA$ and $\bar{A}A$, have the same
rank as $A$:

$$\operatorname{rank}(A) = \operatorname{rank}(A^T) = \operatorname{rank}(\bar{A}) = \operatorname{rank}(A^*) = \operatorname{rank}(A^*A).$$

The last identity is due to the fact that the equation systems

$$(A^*A)x = 0 \quad\text{and}\quad Ax = 0$$

have the same solution space.
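A small numerical experiment makes the exceptions concrete. The sketch below is illustrative only: the `rank` routine is a plain Gaussian elimination with a tolerance, and the two sample matrices are assumptions chosen so that $A^TA$ (respectively $\bar{N}N$) loses rank while $A^*A$ does not.

```python
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def transpose(A):
    return [list(col) for col in zip(*A)]


def conj(A):
    return [[complex(x).conjugate() for x in row] for row in A]


def rank(M, tol=1e-12):
    """Rank by Gaussian elimination with partial pivoting."""
    M = [[complex(x) for x in row] for row in M]
    rows, cols = len(M), len(M[0])
    r = 0
    for c in range(cols):
        if r == rows:
            break
        pivot = max(range(r, rows), key=lambda i: abs(M[i][c]))
        if abs(M[pivot][c]) <= tol:
            continue                      # no pivot in this column
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, rows):
            factor = M[i][c] / M[r][c]
            M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
    return r


# A complex symmetric matrix of rank 1 with A^T A = A^2 = 0.
A = [[1, 1j], [1j, -1]]
A_star = conj(transpose(A))
ranks_A = (rank(A), rank(matmul(transpose(A), A)), rank(matmul(A_star, A)))

# A real nilpotent matrix of rank 1 with conj(N) N = N^2 = 0.
N = [[0, 1], [0, 0]]
ranks_N = (rank(N), rank(matmul(conj(N), N)))
```

Here `ranks_A` compares $\operatorname{rank}(A)$, $\operatorname{rank}(A^TA)$, and $\operatorname{rank}(A^*A)$, and `ranks_N` compares $\operatorname{rank}(N)$ with $\operatorname{rank}(\bar{N}N)$.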


Theorem 3.14 Let $A$ be an $n$-square complex matrix. Then

1. $A$ is similar to its transpose $A^T$.

2. $A$ is similar to $A^*$ (equivalently $\bar{A}$) if and only if the Jordan
blocks of the nonreal eigenvalues of $A$ occur in conjugate pairs.

3. $A^*A$ is similar to $AA^*$.

4. $\bar{A}A$ is similar to $A\bar{A}$.
Proof. For (1), recall from Theorem 3.11 that two matrices $X$ and $Y$
are similar if and only if $\lambda I - X$ and $\lambda I - Y$ have the same standard
form. It is obvious that the matrices $\lambda I - A$ and $\lambda I - A^T$ have the
same standard form. Thus, $A$ and $A^T$ are similar. An alternative
way to show (1) is to verify that for every Jordan block $J$,

$$S^{-1} J S = J^T,$$

where $S$ is the backward identity matrix (Problem 11, Section 3.4).
For (2), let $J_1, \ldots, J_k$ be the Jordan blocks of $A$ and let

$$P^{-1}AP = J_1 \oplus \cdots \oplus J_k$$

for some invertible matrix $P$. Taking the transpose conjugate gives

$$P^* A^* (P^*)^{-1} = J_1^* \oplus \cdots \oplus J_k^*.$$

The right-hand side, by (1), is similar to

$$\bar{J}_1 \oplus \cdots \oplus \bar{J}_k.$$

Thus, if $A$ and $A^*$ are similar, then

$$J_1 \oplus \cdots \oplus J_k \quad\text{and}\quad \bar{J}_1 \oplus \cdots \oplus \bar{J}_k$$

are similar. It follows by the uniqueness of Jordan decomposition
that the Jordan blocks of nonreal complex eigenvalues of $A$ must
occur in conjugate pairs (Problem 6, Section 3.4).

For sufficiency, we may consider the special case

$$A = J \oplus \bar{J} \oplus R, \qquad (3.6)$$

where $J$ and $R$ are Jordan blocks, $J$ is complex, and $R$ is real. Then

$$A^* = \bar{J}^T \oplus J^T \oplus R^T,$$

which is, by permutation, similar to

$$J^T \oplus \bar{J}^T \oplus R^T. \qquad (3.7)$$

Using (1), (3.6) and (3.7) give the similarity of $A^*$ and $A$.
(3) is by a singular value decomposition of $A$.
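We have left to show (4) that $\bar{A}A$ is similar to $A\bar{A}$. It suffices to
show that $\bar{A}A$ and $A\bar{A}$ have the same Jordan decomposition (blocks).
The matrix identity

$$\begin{pmatrix} I & -A \\ 0 & I \end{pmatrix} \begin{pmatrix} A\bar{A} & 0 \\ \bar{A} & 0 \end{pmatrix} \begin{pmatrix} I & A \\ 0 & I \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ \bar{A} & \bar{A}A \end{pmatrix}$$

gives the similarity of the block matrices

$$\begin{pmatrix} A\bar{A} & 0 \\ \bar{A} & 0 \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} 0 & 0 \\ \bar{A} & \bar{A}A \end{pmatrix}.$$

Thus, the nonsingular Jordan blocks of $\bar{A}A$ and $A\bar{A}$ are identical. (In
general, this is true for $AB$ and $BA$. See Problem 12.) On the other
hand, the singular Jordan blocks of $\bar{A}A$ and $\overline{\bar{A}A} = A\bar{A}$ are obviously
the same. This concludes that $\bar{A}A$ and $A\bar{A}$ are similar. ∎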
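The alternative argument for part (1) of Theorem 3.14 is easy to test on a single block. The following sketch (helper names are illustrative, and the eigenvalue $2 + 3i$ is an arbitrary choice) builds a Jordan block $J$ and the backward identity $S$, and checks that $S$ is its own inverse and that $SJS = J^T$.

```python
def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def jordan_block(lam, n):
    """n x n Jordan block: lam on the diagonal, ones on the superdiagonal."""
    return [[lam if i == j else (1 if j == i + 1 else 0) for j in range(n)]
            for i in range(n)]


def backward_identity(n):
    """Ones on the antidiagonal, zeros elsewhere."""
    return [[1 if j == n - 1 - i else 0 for j in range(n)] for i in range(n)]


def transpose(A):
    return [list(col) for col in zip(*A)]


n = 4
J = jordan_block(2 + 3j, n)
S = backward_identity(n)

S_squared = matmul(S, S)             # expected: identity, so S^{-1} = S
SJS = matmul(matmul(S, J), S)        # expected: the transpose of J
```

Conjugating by $S$ reverses the order of both rows and columns, which moves the superdiagonal of ones to the subdiagonal.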

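Following the discussion of the case where $A$ is similar to $A^*$ in
the proof, one may obtain a more profound result. Consider the
matrix with Jordan blocks of conjugate pairs

$$\begin{pmatrix} \lambda & 1 & 0 & 0 \\ 0 & \lambda & 0 & 0 \\ 0 & 0 & \bar\lambda & 1 \\ 0 & 0 & 0 & \bar\lambda \end{pmatrix},$$

which is similar via permutation to

$$\begin{pmatrix} \lambda & 0 & 1 & 0 \\ 0 & \bar\lambda & 0 & 1 \\ 0 & 0 & \lambda & 0 \\ 0 & 0 & 0 & \bar\lambda \end{pmatrix} = \begin{pmatrix} C(\lambda) & I \\ 0 & C(\lambda) \end{pmatrix},$$

where

$$C(\lambda) = \begin{pmatrix} \lambda & 0 \\ 0 & \bar\lambda \end{pmatrix}.$$

If $\lambda = a + bi$ with $a, b \in \mathbb{R}$, then we have by computation

$$\begin{pmatrix} -i & -i \\ 1 & -1 \end{pmatrix} \begin{pmatrix} \lambda & 0 \\ 0 & \bar\lambda \end{pmatrix} \begin{pmatrix} -i & -i \\ 1 & -1 \end{pmatrix}^{-1} = \begin{pmatrix} a & b \\ -b & a \end{pmatrix}.$$

Thus, matrices

$$\begin{pmatrix} \lambda & 0 & 1 & 0 \\ 0 & \bar\lambda & 0 & 1 \\ 0 & 0 & \lambda & 0 \\ 0 & 0 & 0 & \bar\lambda \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} a & b & 1 & 0 \\ -b & a & 0 & 1 \\ 0 & 0 & a & b \\ 0 & 0 & -b & a \end{pmatrix}$$

are similar. These observations lead to the following theorem.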


Theorem 3.15 A square matrix $A$ is similar to $A^*$ (equivalently $\bar{A}$)
if and only if $A$ is similar to a real matrix.

As a result, $\bar{A}A$ is similar to a real matrix. The ideas of pairing
Jordan blocks are often used in the similarity theory of matrices.
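The $2 \times 2$ computation behind the real form can be verified numerically. In the sketch below the similarity matrix is taken to be $Q = \begin{pmatrix} -i & -i \\ 1 & -1 \end{pmatrix}$, which is an assumption consistent with the stated result; the code checks that $Q$ is invertible and that $Q\,C(\lambda) = R\,Q$ with $R = \begin{pmatrix} a & b \\ -b & a \end{pmatrix}$, which is equivalent to $Q\,C(\lambda)\,Q^{-1} = R$. The value of $\lambda$ is arbitrary.

```python
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


lam = complex(2, 3)                    # lambda = a + bi, arbitrary sample value
a, b = lam.real, lam.imag

Q = [[-1j, -1j], [1, -1]]              # assumed similarity matrix
C = [[lam, 0], [0, lam.conjugate()]]   # C(lambda) = diag(lambda, conj(lambda))
R = [[a, b], [-b, a]]                  # the real form

det_Q = Q[0][0] * Q[1][1] - Q[0][1] * Q[1][0]
QC = matmul(Q, C)
RQ = matmul(R, Q)
max_gap = max(abs(QC[i][j] - RQ[i][j]) for i in range(2) for j in range(2))
```

Applying $Q \oplus Q$ to the $4 \times 4$ block matrix $\begin{pmatrix} C(\lambda) & I \\ 0 & C(\lambda) \end{pmatrix}$ then yields the real matrix displayed in the text.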


Problems 


1. Let $A$ be an $m \times n$ matrix. Prove or disprove
(a) $\operatorname{rank}(A) = \operatorname{rank}(A^TA)$.
(b) $\operatorname{rank}(A^*A) = \operatorname{rank}(AA^*)$.
(c) $A^TA$ is similar to $AA^T$.


2. Is it possible that $\bar{A}A = 0$ or $A^*A = 0$ for a nonzero $A \in M_n$?


3. Let


#0 he Bl eo) 


Compute $AB$ and $BA$. Find $\operatorname{rank}(AB)$ and $\operatorname{rank}(BA)$. What are
the Jordan forms of $AB$ and $BA$? Are $AB$ and $BA$ similar?


4. If the nonreal eigenvalues of a square matrix $A$ occur in conjugate
pairs, does it follow that $A$ is similar to $A^*$?


5. Show that the characteristic polynomial of $\bar{A}A$ has only real coefficients.
Conclude that the nonreal eigenvalues of $\bar{A}A$ must occur in
conjugate pairs.


6. Let $A = B + iC \in M_n$ be nonsingular, where $B$ and $C$ are real
square matrices. Show that $\bar{A} = A^{-1}$ if and only if $BC = CB$ and
$B^2 + C^2 = I$. Find the conditions on $B$ and $C$ if $A^T = \bar{A} = A^{-1}$.


7. Let $A \in M_n$. Show that the following statements are equivalent.

(a) $A$ is similar to $A^*$ (equivalently $\bar{A}$).
(b) The elementary divisors occur in conjugate pairs.
(c) The invariant factors of $A$ all have real coefficients.

Are they equivalent to the statement "$\det(\lambda I - A)$ has real coefficients"?


8. If $A \in M_n$ has only real eigenvalues, show that $A$ is similar to $A^*$.


9. Let $A$ be a nonsingular matrix. When is $A^{-1}$ similar to $A$?


10. Let $A \in M_n$ and $B$ be $m \times n$. Let $M$ be the $(m+n)$-square matrix

$$M = \begin{pmatrix} A & 0 \\ B & 0 \end{pmatrix}.$$

Show that the nonsingular Jordan blocks of $A$ and $M$ are identical.
11. Let $A$, $B$, $C$, and $D$ be $n$-square complex matrices. If the matrices

$$\begin{pmatrix} A & B \\ 0 & 0 \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} C & D \\ 0 & 0 \end{pmatrix}$$

are similar, does it follow that $A$ and $C$ are similar?


12. Let $A$ and $B^T$ be $m \times n$ complex matrices. Show that the nonsingular
Jordan blocks of $AB$ and $BA$ are identical. Conclude that $AB$ and
$BA$ have the same nonzero eigenvalues, including multiplicity.


13. If $A$ and $B \in M_n$ have no common eigenvalues, show that the following
two block matrices are similar for any $X \in M_n$:

$$\begin{pmatrix} A & X \\ 0 & B \end{pmatrix}, \qquad \begin{pmatrix} A & 0 \\ 0 & B \end{pmatrix}.$$


14. Let $A$, $B$, and $C$ be matrices of appropriate sizes. Show that

$$AX - YB = C$$

for some matrices $X$ and $Y$ if and only if the block matrices

$$\begin{pmatrix} A & C \\ 0 & B \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} A & 0 \\ 0 & B \end{pmatrix}$$

have the same rank.


15. Let $A$ and $B$ be $n$-square complex matrices and let

$$M = \begin{pmatrix} A & B \\ -\bar{B} & \bar{A} \end{pmatrix}.$$

Show that

(a) The characteristic polynomial of $M$ has real coefficients.

(b) The eigenvalues of $M$ occur in conjugate pairs with eigenvectors
in forms $\begin{pmatrix} x \\ y \end{pmatrix} \in \mathbb{C}^{2n}$ and $\begin{pmatrix} -\bar{y} \\ \bar{x} \end{pmatrix} \in \mathbb{C}^{2n}$.

(c) The eigenvectors in (b) are linearly independent.

(d) $\det M \ge 0$. In particular, $\det(I + \bar{A}A) \ge 0$ for any $A \in M_n$.


CHAPTER 4 


Numerical Ranges, Matrix Norms, and 
Special Operations 


Introduction: This chapter is devoted to a few basic topics on matri- 
ces. We first study the numerical range and radius of a square matrix 
and matrix norms. We then introduce three important special ma- 
trix operations: the Kronecker product, the Hadamard product, and 
compound matrices. 


4.1 Numerical Range and Radius 


Let $A$ be an $n \times n$ complex matrix. For $x = (x_1, \ldots, x_n)^T \in \mathbb{C}^n$,
as usual, $\|x\| = \left(\sum_{i=1}^n |x_i|^2\right)^{1/2}$ is the norm of $x$ (Section 1.4). The
numerical range, also known as the field of values, of $A$ is defined by

$$W(A) = \{ x^*Ax : \|x\| = 1,\ x \in \mathbb{C}^n \}.$$

For example, if

$$A = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix},$$

then $W(A)$ is the closed interval $[0, 1]$, and if

$$A = \begin{pmatrix} 0 & 0 \\ 1 & 1 \end{pmatrix},$$

then $W(A)$ is the closed elliptical disc with foci at $(0,0)$ and $(1,0)$,
minor axis 1, and major axis $\sqrt{2}$ (Problem 9).
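The two examples can be explored by sampling. The sketch below draws random unit vectors $x$ and records $x^*Ax$; it is a Monte Carlo illustration, not a computation of $W(A)$ itself. The second matrix $\begin{pmatrix} 0 & 0 \\ 1 & 1 \end{pmatrix}$ is an assumption: it is one matrix whose numerical range has foci $0$ and $1$, minor axis $1$, and major axis $\sqrt{2}$, so every sample $z$ should satisfy $|z| + |z - 1| \le \sqrt{2}$.

```python
import math
import random


def quad_form(A, x):
    """Return x* A x for a square matrix A and a complex vector x."""
    n = len(x)
    return sum(x[i].conjugate() * A[i][j] * x[j]
               for i in range(n) for j in range(n))


def random_unit_vector(n, rng):
    """A random unit vector in C^n (normalized complex Gaussian)."""
    v = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
    norm = math.sqrt(sum(abs(t) ** 2 for t in v))
    return [t / norm for t in v]


rng = random.Random(0)                 # fixed seed for reproducibility

A1 = [[1, 0], [0, 0]]
samples1 = [quad_form(A1, random_unit_vector(2, rng)) for _ in range(2000)]

A2 = [[0, 0], [1, 1]]                  # assumed second example
samples2 = [quad_form(A2, random_unit_vector(2, rng)) for _ in range(2000)]

# Largest sum of distances to the foci 0 and 1 among the samples.
max_focal_sum = max(abs(z) + abs(z - 1) for z in samples2)
```

For `A1` the samples equal $|x_1|^2$, so they are real and lie in $[0, 1]$; for `A2` the quantity `max_focal_sum` stays at or below $\sqrt{2}$, the major axis of the ellipse.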
One of the celebrated and fundamental results on numerical range
is the Toeplitz-Hausdorff convexity theorem.


Theorem 4.1 (Toeplitz—Hausdorff) The numerical range of a square 
matrix is a convex compact subset of the complex plane. 


Proof. For convexity, if $W(A)$ is a singleton, there is nothing to
show. Suppose $W(A)$ has more than one point. We prove that the
line segment joining any two distinct points in $W(A)$ lies in $W(A)$;
that is, if $u, v \in W(A)$, then $tu + (1-t)v \in W(A)$ for all $t \in [0, 1]$.
For any complex numbers $\alpha$ and $\beta$, it is easy to verify that

$$W(\alpha I + \beta A) = \{ \alpha + \beta z : z \in W(A) \}.$$

Intuitively the convexity of $W(A)$ does not change under shifting,
scaling, and rotation. Thus, we may assume that the two points to
be considered are 0 and 1, and show that $[0, 1] \subseteq W(A)$. Write

$$A = H + iK,$$

where

$$H = \tfrac{1}{2}(A + A^*) \quad\text{and}\quad K = \tfrac{1}{2i}(A - A^*)$$

are Hermitian matrices. Let $x$ and $y$ be unit vectors in $\mathbb{C}^n$ such that

$$x^*Ax = 0, \qquad y^*Ay = 1.$$

It follows that $x$ and $y$ are linearly independent and that

$$x^*Hx = x^*Kx = y^*Ky = 0, \qquad y^*Hy = 1.$$

We may further assume that $x^*Ky$ has real part zero; otherwise, one
may replace $x$ with $cx$, $c \in \mathbb{C}$, and $|c| = 1$, so that $\bar{c}\,x^*Ky$ is 0 or a
purely imaginary number without changing the value of $x^*Ax$.

Note that $tx + (1-t)y \neq 0$, $t \in [0, 1]$. Define for $t \in [0, 1]$

$$z(t) = \frac{1}{\|tx + (1-t)y\|}\,(tx + (1-t)y).$$

Then $z(t)$ is a unit vector. It is easy to compute that for all $t \in [0, 1]$

$$z(t)^* K z(t) = 0.$$

The convexity of $W(A)$ then follows, for $z(t)^*Az(t) = z(t)^*Hz(t)$ is real and
depends continuously on $t$, with value 1 at $t = 0$ and 0 at $t = 1$, so that

$$\{ z(t)^* A z(t) : 0 \le t \le 1 \} \supseteq [0, 1].$$

The compactness of $W(A)$, meaning the boundary is contained
in $W(A)$, is seen by noting that $W(A)$ is the range of the continuous
function $x \mapsto x^*Ax$ on the compact set $\{ x \in \mathbb{C}^n : \|x\| = 1 \}$. (A
continuous function maps a compact set to a compact set.) ∎

When considering the smallest disc centered at the origin that
covers the numerical range, we associate with $W(A)$ a number

$$w(A) = \sup\{ |z| : z \in W(A) \} = \sup_{\|x\|=1} |x^*Ax|$$

and call it the numerical radius of $A \in M_n$. Note that the "sup" can
be attained by some $z \in W(A)$. It is immediate that for any $x \in \mathbb{C}^n$

$$|x^*Ax| \le w(A)\|x\|^2. \qquad (4.1)$$

We now make comparisons of the numerical radius $w(A)$ to the
largest eigenvalue $\rho(A)$ in absolute value, or the spectral radius, i.e.,

$$\rho(A) = \max\{ |\lambda| : \lambda \text{ is an eigenvalue of } A \},$$

and to the largest singular value $\sigma_{\max}(A)$, also called the spectral
norm. It is easy to see (Problem 7) that

$$\sigma_{\max}(A) = \sup_{\|x\|=1} \|Ax\| = \sup_{x \neq 0} \frac{\|Ax\|}{\|x\|}$$

and that for every $x \in \mathbb{C}^n$

$$\|Ax\| \le \sigma_{\max}(A)\|x\|.$$
Theorem 4.2 Let $A$ be a square complex matrix. Then

$$\rho(A) \le w(A) \le \sigma_{\max}(A) \le 2w(A).$$

Proof. Let $\lambda$ be the eigenvalue of $A$ such that $\rho(A) = |\lambda|$, and let $u$
be a unit eigenvector corresponding to $\lambda$. Then

$$\rho(A) = |\lambda u^*u| = |u^*Au| \le w(A).$$

The second inequality follows from the Cauchy-Schwarz inequality

$$|x^*Ax| = |(Ax, x)| \le \|Ax\|\,\|x\|.$$

We next show that $\sigma_{\max}(A) \le 2w(A)$. It can be verified that

$$4(Ax, y) = (A(x+y), x+y) - (A(x-y), x-y) + i(A(x+iy), x+iy) - i(A(x-iy), x-iy).$$

Using (4.1), it follows that

$$4|(Ax, y)| \le w(A)\left( \|x+y\|^2 + \|x-y\|^2 + \|x+iy\|^2 + \|x-iy\|^2 \right) = 4w(A)\left( \|x\|^2 + \|y\|^2 \right).$$

Thus, for any unit $x$ and $y$ in $\mathbb{C}^n$, we have

$$|(Ax, y)| \le 2w(A).$$

The inequality follows immediately from Problem 7. ∎
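The chain of inequalities in Theorem 4.2 can be observed numerically for $2 \times 2$ matrices. The sketch below is an illustration under stated assumptions: it uses the standard identity $w(A) = \max_\theta \lambda_{\max}\big(\tfrac12(e^{i\theta}A + e^{-i\theta}A^*)\big)$, approximated on a finite grid of angles (so the computed value is a slight underestimate in general), together with closed-form eigenvalues of $2 \times 2$ matrices. All function names are hypothetical.

```python
import cmath
import math


def conj_transpose(A):
    return [[A[j][i].conjugate() for j in range(2)] for i in range(2)]


def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def max_eig_hermitian(H):
    """Largest eigenvalue of a 2x2 Hermitian matrix (closed form)."""
    a, d, b = H[0][0].real, H[1][1].real, H[0][1]
    return (a + d) / 2 + math.sqrt(((a - d) / 2) ** 2 + abs(b) ** 2)


def spectral_radius(A):
    tr = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    disc = cmath.sqrt(tr * tr - 4 * det)
    return max(abs((tr + disc) / 2), abs((tr - disc) / 2))


def spectral_norm(A):
    """sigma_max(A) = sqrt of the largest eigenvalue of A* A."""
    return math.sqrt(max_eig_hermitian(matmul(conj_transpose(A), A)))


def numerical_radius(A, steps=3600):
    """Grid approximation of max over theta of lambda_max(Re(e^{i theta} A))."""
    As = conj_transpose(A)
    best = 0.0
    for k in range(steps):
        u = cmath.exp(2j * math.pi * k / steps)
        H = [[(u * A[i][j] + u.conjugate() * As[i][j]) / 2 for j in range(2)]
             for i in range(2)]
        best = max(best, max_eig_hermitian(H))
    return best


def check(A, tol=1e-6):
    """Return (rho, w, sigma, whether rho <= w <= sigma <= 2w up to tol)."""
    A = [[complex(x) for x in row] for row in A]
    rho, w, sigma = spectral_radius(A), numerical_radius(A), spectral_norm(A)
    ok = rho <= w + tol and w <= sigma + tol and sigma <= 2 * w + tol
    return rho, w, sigma, ok


rho, w, sigma, ok = check([[0, 1], [0, 0]])
```

For the nilpotent matrix in the last line the three quantities are $0$, $\tfrac12$, and $1$, so the final inequality $\sigma_{\max}(A) \le 2w(A)$ holds with equality.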


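Theorem 4.3 Let $A \in M_n$. Then $\lim_{k \to \infty} A^k = 0$ if and only if
$\rho(A) < 1$; that is, all the eigenvalues of $A$ have moduli less than 1.

Proof. Let $A = P^{-1}TP$ be a Jordan decomposition of $A$, where $P$ is
invertible and $T$ is a direct sum of Jordan blocks with the eigenvalues
$\lambda_1, \ldots, \lambda_n$ of $A$ on the main diagonal. Then $A^k = P^{-1}T^kP$ and
$\rho(A^k) = (\rho(A))^k$. Thus, if $A^k$ tends to zero, so does $T^k$. It follows
that $\lambda^k \to 0$ as $k \to \infty$ for every eigenvalue $\lambda$ of $A$. Therefore $\rho(A) < 1$.
Conversely, suppose $\rho(A) < 1$. We show that $A^k \to 0$ as $k \to \infty$. It
suffices to show that $J^k \to 0$ as $k \to \infty$ for each Jordan block $J$.
Suppose $J$ is an $m \times m$ Jordan block:

$$J = \begin{pmatrix} \lambda & 1 & \cdots & 0 \\ & \lambda & \ddots & \vdots \\ & & \ddots & 1 \\ 0 & & & \lambda \end{pmatrix}.$$

Upon computation, we have

$$J^k = \begin{pmatrix} \lambda^k & \binom{k}{1}\lambda^{k-1} & \cdots & \binom{k}{m-1}\lambda^{k-m+1} \\ & \lambda^k & \ddots & \vdots \\ & & \ddots & \binom{k}{1}\lambda^{k-1} \\ 0 & & & \lambda^k \end{pmatrix}.$$

Recall from calculus that for any constants $l$ and $|\lambda| < 1$

$$\lim_{k \to \infty} \binom{k}{l} \lambda^k = 0.$$

It follows that $J^k$, thus $A^k$, converges to 0 as $k \to \infty$. ∎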
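The behavior of $J^k$ is worth seeing in numbers, because the entries may grow before they decay. The sketch below (function names and the values $\lambda = 0.9$, $m = 3$ are illustrative assumptions) fills in $J^k$ from the binomial formula, compares it with repeated multiplication for a small $k$, and records the largest entry for several $k$.

```python
from math import comb


def jordan_power(lam, m, k):
    """k-th power of the m x m Jordan block with eigenvalue lam, from the
    closed form: entry (i, j) is C(k, j-i) * lam^(k-(j-i)) for j >= i."""
    P = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(i, m):
            l = j - i
            if l <= k:
                P[i][j] = comb(k, l) * lam ** (k - l)
    return P


def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][t] * Y[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]


lam, m = 0.9, 3
J = jordan_power(lam, m, 1)            # the Jordan block itself

# Cross-check the closed form against repeated multiplication for k = 5.
by_product = J
for _ in range(4):
    by_product = matmul(by_product, J)
closed = jordan_power(lam, m, 5)
gap = max(abs(by_product[i][j] - closed[i][j])
          for i in range(m) for j in range(m))

# Largest entry of J^k for increasing k: growth first, then decay to 0.
largest = {k: max(max(row) for row in jordan_power(lam, m, k))
           for k in (1, 10, 50, 200, 400)}
```

With $\lambda = 0.9$ the corner entry $\binom{k}{2}\lambda^{k-2}$ exceeds 1 around $k = 10$ and only later becomes negligible, which is the phenomenon the limit $\binom{k}{l}\lambda^k \to 0$ controls.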


Problems 


1. Find a nonzero matrix $A$ so that $\rho(A) = 0$.


2. Find the eigenvalues, singular values, numerical radius, spectral ra- 
dius, spectral norm, and numerical range for each of the following: 


0 1 1 1 1 1 
1 0)? 0 1}? I A 
3. Let $A$ be an $n$-square complex matrix. Show that the numerical
radius, spectral radius, spectral norm, and numerical range are unitarily
invariant. That is, for instance, $w(U^*AU) = w(A)$ for any
$n$-square unitary matrix $U$.


4. Show that the diagonal entries and the eigenvalues of a square matrix 
are contained in the numerical range of the matrix. 


5. Let $A \in M_n$. Show that $\frac{1}{n}\operatorname{tr} A$ is contained in $W(A)$. Conclude that
for any nonsingular $P \in M_n$, $W(P^{-1}AP - PAP^{-1})$ contains 0.


6. Let $A$ be a square complex matrix. Show that $\frac{\|Ax\|}{\|x\|}$ is constant for
all $x \neq 0$ if and only if all the singular values of $A$ are identical.


7. Let $A$ be a complex matrix. Show that

$$\sigma_{\max}(A) = \sqrt{\rho(A^*A)} = \sup_{\|x\|=1} \|Ax\| = \sup_{x \neq 0} \frac{\|Ax\|}{\|x\|} = \sup_{\|x\|=\|y\|=1} |(Ax, y)|.$$


8. Show that for any square matrices $A$ and $B$ of the same size,

$$\sigma_{\max}(AB) \le \sigma_{\max}(A)\,\sigma_{\max}(B)$$

and

$$\sigma_{\max}(A + B) \le \sigma_{\max}(A) + \sigma_{\max}(B).$$


9. Show that the numerical range of $\begin{pmatrix} 0 & 0 \\ 1 & 1 \end{pmatrix}$ is a closed elliptical disc.


10. Take

$$A = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{pmatrix}$$

and let $B = A^2$. Show that $w(A) < 1$. Find $w(B)$ and $w(AB)$.


11. Show that the numerical range of a normal matrix is the convex
hull of its eigenvalues. That is, if $A \in M_n$ is a normal matrix with
eigenvalues $\lambda_1, \ldots, \lambda_n$, then

$$W(A) = \{ t_1\lambda_1 + \cdots + t_n\lambda_n : t_1 + \cdots + t_n = 1,\ \text{each } t_i \ge 0 \}.$$


12. Show that $W(A)$ is a polygon inscribed in the unit circle if $A$ is
unitary, and that $W(A) \subseteq \mathbb{R}$ if $A$ is Hermitian. What can be said
about $W(A)$ if $A$ is positive semidefinite?


13. Show that $w(A) = \rho(A) = \sigma_{\max}(A)$ if $A$ is normal. Discuss the
converse by considering

$$A = \operatorname{diag}(1, i, -1, -i) \oplus \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}.$$


14. Prove or disprove that for any $n$-square complex matrices $A$ and $B$

(a) $\rho(AB) \le \rho(A)\rho(B)$.

(b) $w(AB) \le w(A)w(B)$.

(c) $\sigma_{\max}(AB) \le \sigma_{\max}(A)\sigma_{\max}(B)$.


15. Let $A$ be a square matrix. Show that for every positive integer $k$

$$w(A^k) \le (w(A))^k.$$

Is it true in general that

$$w(A^{k+m}) \le w(A^k)\,w(A^m)?$$
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4.2 Matrix Norms 


A matrix may be assigned numerical quantities in various ways. In addition to the determinant, trace, eigenvalues, singular values, numerical radius, and spectral radius, the matrix norm is another important one. Recall from Section 1.4 of Chapter 1 that vectors can be measured by their norms. If $V$ is an inner product space, then the norm of a vector $v$ in $V$ is $\|v\| = \sqrt{(v, v)}$. The norm $\|\cdot\|$ on $V$ satisfies

i. $\|v\| \ge 0$ with equality if and only if $v = 0$,
ii. $\|cv\| \le |c|\,\|v\|$ for all scalars $c$ and vectors $v$, and
iii. $\|u + v\| \le \|u\| + \|v\|$ for all vectors $u$, $v$.

Like the norms for vectors being introduced to measure the magnitudes of vectors, norms for matrices are used to measure the “sizes” of matrices. We call a matrix function $\|\cdot\| : M_n \to \mathbb{R}$ a matrix norm if for all $A, B \in M_n$ and $c \in \mathbb{C}$, the following conditions are satisfied:

1. $\|A\| \ge 0$ with equality if and only if $A = 0$,
2. $\|cA\| \le |c|\,\|A\|$,
3. $\|A + B\| \le \|A\| + \|B\|$, and
4. $\|AB\| \le \|A\|\,\|B\|$.

We call $\|\cdot\|$ for matrices satisfying (1)-(3) a matrix-vector norm. In this book by a matrix norm we mean that all conditions (1)-(4) are met. Such a matrix norm is sometimes referred to as a multiplicative matrix norm. We use the notation $\|\cdot\|$ for both vector norm and matrix norm. Generally speaking, this won’t cause confusion as one can easily tell from what is being studied.

If a matrix is considered as a linear operator on an inner product space $V$, a matrix operator norm $\|\cdot\|_{op}$ can be induced as follows:

$$\|A\|_{op} = \sup_{x \neq 0} \frac{\|Ax\|}{\|x\|} = \sup_{\|x\|=1} \|Ax\|.$$

From the previous section (Problems 7 and 8, Section 4.1), we see that the spectral norm is a matrix (operator) norm on $M_n$ induced by the ordinary inner product on $\mathbb{C}^n$.

Matrices can be viewed as vectors in the matrix space $M_n$ equipped with the inner product $(A, B) = \operatorname{tr}(B^*A)$. Matrices as vectors under the inner product have vector norms. One may check that this vector norm for matrices is also a (multiplicative) matrix norm.

Two observations on matrix norms follow: first, a matrix norm $\|\cdot\| : A \mapsto \|A\|$ is a continuous function on the matrix space $M_n$ (Problem 2), and second, $\rho(\cdot) \le \|\cdot\|$ for any matrix norm $\|\cdot\|$. Reason: If $Ax = \lambda x$, where $x \neq 0$ and $\rho(A) = |\lambda|$, then, by letting $X$ be the $n \times n$ matrix with all columns equal to the eigenvector $x$,

$$\rho(A)\,\|X\| = \|\lambda X\| = \|AX\| \le \|A\|\,\|X\|.$$


Nevertheless, the spectral radius $\rho(\cdot)$ is not a matrix norm. (Why?) The following result reveals a relation between the two.


Theorem 4.4 Let $\|\cdot\|$ be a matrix norm. Then for every $A \in M_n$

$$\rho(A) = \lim_{k \to \infty} \|A^k\|^{1/k}.$$

Proof. The eigenvalues of $A^k$ are the $k$th powers of those of $A$. Since spectral radius is dominated by norm, for every positive integer $k$,

$$(\rho(A))^k = \rho(A^k) \le \|A^k\| \quad \text{or} \quad \rho(A) \le \|A^k\|^{1/k}.$$

On the other hand, for any $\epsilon > 0$, let

$$A_\epsilon = \frac{1}{\rho(A) + \epsilon}\, A.$$

Then $\rho(A_\epsilon) < 1$. By Theorem 4.3, $A_\epsilon^k$ tends to 0 as $k \to \infty$. Because the norm is a continuous function, for $k$ large enough,

$$\|A_\epsilon^k\| < 1 \quad \text{or} \quad \|A^k\| \le (\rho(A) + \epsilon)^k.$$

Therefore

$$\|A^k\|^{1/k} \le \rho(A) + \epsilon.$$

In summary, for any $\epsilon > 0$ and $k$ large enough

$$\rho(A) \le \|A^k\|^{1/k} \le \rho(A) + \epsilon.$$

The conclusion follows immediately by letting $\epsilon$ approach 0.
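
The limit in Theorem 4.4 can be observed numerically. The following is a minimal sketch, assuming NumPy is available; the matrix `A` and the helper name `gelfand_estimates` are illustrative choices, not part of the text. It uses the Frobenius norm, which is a matrix norm in the sense above.

```python
import numpy as np

def gelfand_estimates(A, ks, ord="fro"):
    """Return ||A^k||^(1/k) for each k in ks, for the chosen matrix norm."""
    return [np.linalg.norm(np.linalg.matrix_power(A, k), ord) ** (1.0 / k)
            for k in ks]

# Illustrative upper-triangular matrix; its eigenvalues are 0.5 and 0.25.
A = np.array([[0.5, 1.0],
              [0.0, 0.25]])

rho = max(abs(np.linalg.eigvals(A)))            # spectral radius of A
estimates = gelfand_estimates(A, [1, 10, 100, 200])

for k, e in zip([1, 10, 100, 200], estimates):
    print(k, e)                                  # values decrease toward rho
```

Every estimate stays at or above the spectral radius, as the first half of the proof predicts, and the gap shrinks as $k$ grows.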


Now we turn our attention to an important class of matrix norms: unitarily invariant norms. We say a matrix (vector) norm $\|\cdot\|$ on $M_n$ is unitarily invariant if for any $A \in M_n$ and for all unitary $U, V \in M_n$

$$\|UAV\| = \|A\|.$$

The spectral norm $\sigma_{\max} : A \mapsto \sigma_{\max}(A)$ is a matrix norm and it is unitarily invariant because $\sigma_{\max}(UAV) = \sigma_{\max}(A)$. The Frobenius norm (also known as the Euclidean norm or Hilbert-Schmidt norm) is the matrix norm induced by the inner product $(A, B) = \operatorname{tr}(B^*A)$ on the matrix space $M_n$

$$\|A\|_F = \big(\operatorname{tr}(A^*A)\big)^{1/2} = \Big(\sum_{i,j=1}^{n} |a_{ij}|^2\Big)^{1/2}.$$

With $\sigma_i(A)$ denoting the singular values of $A$, we see that

$$\|A\|_F = \Big(\sum_{i=1}^{n} \sigma_i^2(A)\Big)^{1/2}.$$

Thus $\|A\|_F$ is uniquely determined by the singular values of $A$. Consequently, the Frobenius norm is a unitarily invariant matrix norm.

Actually, the spectral norm and the Frobenius norm belong to two larger families of unitarily invariant norms: the Ky Fan $k$-norms and the Schatten $p$-norms, which we study more in Chapter 10. In both definitions the singular values are arranged in decreasing order, $\sigma_1(A) \ge \cdots \ge \sigma_n(A)$.

Ky Fan $k$-norm: Let $k \le n$ be a positive integer. Define

$$\|A\|_{(k)} = \sum_{i=1}^{k} \sigma_i(A), \quad A \in M_n.$$

Schatten $p$-norm: Let $p \ge 1$ be a real number. Define

$$\|A\|_p = \Big(\sum_{i=1}^{n} \sigma_i^p(A)\Big)^{1/p}, \quad A \in M_n.$$

It is readily seen that the spectral norm is the Ky Fan $k$-norm when $k = 1$; it also equals the limit of $\|A\|_p$ as $p \to \infty$, i.e., $\|A\|_\infty$, whereas the Frobenius norm is the Schatten 2-norm, i.e., $\|A\|_F = \|A\|_2$.
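
Both families are easy to compute from a singular value decomposition. The sketch below assumes NumPy; the function names `ky_fan` and `schatten` and the test matrices are hypothetical, chosen only to illustrate the special cases just mentioned.

```python
import numpy as np

def ky_fan(A, k):
    """Ky Fan k-norm: sum of the k largest singular values of A."""
    s = np.linalg.svd(A, compute_uv=False)      # returned in decreasing order
    return s[:k].sum()

def schatten(A, p):
    """Schatten p-norm: the l_p norm of the vector of singular values."""
    s = np.linalg.svd(A, compute_uv=False)
    return (s ** p).sum() ** (1.0 / p)

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
spectral = np.linalg.norm(A, 2)                 # largest singular value
frobenius = np.linalg.norm(A, "fro")

# A rotation is unitary, so it should leave these norms unchanged.
t = 0.7
U = np.array([[np.cos(t), -np.sin(t)],
              [np.sin(t),  np.cos(t)]])
```

With these definitions, `ky_fan(A, 1)` reproduces the spectral norm, `schatten(A, 2)` reproduces the Frobenius norm, and `schatten(A, p)` approaches the spectral norm for large `p`.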


Problems 
1. Let $\|\cdot\|$ be a vector norm on $\mathbb{C}^n$ (or $\mathbb{R}^n$). Define

$$\|x\|^D = \max\{ |(x, y)| : y \in \mathbb{C}^n, \ \|y\| = 1 \}.$$

Show that $\|\cdot\|^D$ is a vector norm on $\mathbb{C}^n$ (known as the dual norm).


2. Show that for any matrix norm $\|\cdot\|$ on $M_n$ and $A = (a_{ij}), B \in M_n$

$$\big|\, \|A\| - \|B\| \,\big| \le \|A - B\| \quad \text{and} \quad \|A\| \le \sum_{i,j} |a_{ij}|\, \|E_{ij}\|,$$

where $E_{ij}$ is the matrix with $(i, j)$-entry 1 and elsewhere 0 for all $i, j$.

3. Let $A, B \in M_n$. Show that

$$\|A + B\|_F \le \|A\|_F + \|B\|_F \quad \text{and} \quad \|AB\|_F \le \|A\|_F \|B\|_F.$$


4. Let $A \in M_n$. Show that for any matrix norm $\|\cdot\|$ and integer $k \ge 1$,

$$\|A^k\| \le \|A\|^k \quad \text{and} \quad \|A^k\|^{-1} \|I\| \le \|A^{-1}\|^k \ \text{ if } A \text{ is invertible.}$$

5. Let $A \in M_n$ be given. If there exists a matrix norm $\|\cdot\|$ such that $\|A\| < 1$, show that $A^k \to 0$ as $k \to \infty$.


6. Let $\|\cdot\|$ be a matrix norm on $M_n$. Show that for any invertible matrix $P \in M_n$, $\|\cdot\|_P : M_n \to \mathbb{R}$ defined by $\|A\|_P = \|P^{-1}AP\|$ for all matrices $A \in M_n$ is also a matrix norm.

7. Let $A = (a_{ij}) \in M_n$ and define $\|A\|_\infty = \max_{1 \le i, j \le n} |a_{ij}|$. Show that $\|\cdot\|_\infty$ is a matrix-vector norm, but not a multiplicative matrix norm.


8. Show that $\|\cdot\|_s$, $\|\cdot\|_1$, and $|||\cdot|||_\infty$ are matrix norms on $M_n$, where

$$\|A\|_s = n\|A\|_\infty, \qquad \|A\|_1 = \sum_{1 \le i, j \le n} |a_{ij}|, \qquad |||A|||_\infty = \max_{1 \le i \le n} \sum_{j=1}^{n} |a_{ij}|.$$

Are these (multiplicative) matrix norms unitarily invariant?


4.3 The Kronecker and Hadamard Products


Matrices can be multiplied in different ways. The Kronecker prod- 
uct and Hadamard product, defined below, used in many fields, are 
almost as important as the ordinary product. Another basic ma- 
trix operation is “compounding” matrices, which is evidently a use- 
ful tool in deriving matrix inequalities. This section introduces the 
three concepts and presents their properties. 

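The Kronecker product, also known as tensor product or direct product, of two matrices $A$ and $B$ of sizes $m \times n$ and $s \times t$, respectively, is defined to be the $(ms) \times (nt)$ matrix

$$A \otimes B = \begin{pmatrix} a_{11}B & a_{12}B & \cdots & a_{1n}B \\ a_{21}B & a_{22}B & \cdots & a_{2n}B \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1}B & a_{m2}B & \cdots & a_{mn}B \end{pmatrix}.$$

In other words, the Kronecker product $A \otimes B$ is an $(ms) \times (nt)$ matrix, partitioned into $mn$ blocks with the $(i, j)$ block the $s \times t$ matrix $a_{ij}B$. Note that $A$ and $B$ can have any different sizes.

The Hadamard product, or the Schur product, of two matrices $A$ and $B$ of the same size is defined to be the entrywise product

$$A \circ B = (a_{ij} b_{ij}).$$

In particular, for $u = (u_1, u_2, \dots, u_n)$, $v = (v_1, v_2, \dots, v_n) \in \mathbb{C}^n$,

$$u \otimes v = (u_1 v_1, \dots, u_1 v_n, \dots, u_n v_1, \dots, u_n v_n)$$

and

$$u \circ v = (u_1 v_1, u_2 v_2, \dots, u_n v_n).$$

Note that $A \otimes B \neq B \otimes A$ in general and $A \circ B = B \circ A$. We take, for example, $A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}$ and $B = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$. Then

$$A \otimes B = \begin{pmatrix} a & b & 2a & 2b \\ c & d & 2c & 2d \\ 3a & 3b & 4a & 4b \\ 3c & 3d & 4c & 4d \end{pmatrix}, \qquad A \circ B = \begin{pmatrix} a & 2b \\ 3c & 4d \end{pmatrix}.$$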
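
A numerical version of this example can be sketched with NumPy, whose `np.kron` computes the Kronecker product and whose elementwise `*` computes the Hadamard product. The concrete entries 5, 6, 7, 8 below are an arbitrary stand-in for the symbolic $a, b, c, d$.

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],      # plays the role of (a b; c d) with a=5, b=6, c=7, d=8
              [7, 8]])

kron_AB = np.kron(A, B)    # 4 x 4 block matrix (a_ij * B)
kron_BA = np.kron(B, A)    # generally different from kron_AB
hadamard = A * B           # entrywise product (a_ij * b_ij)

print(kron_AB)
print(hadamard)
```

Comparing `kron_AB` with `kron_BA` shows that the Kronecker product is not commutative, while `A * B` and `B * A` agree.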
The Kronecker product has the following basic properties, each of which can be verified by definition and direct computations.


Theorem 4.5 Let $A$, $B$, $C$ be matrices of appropriate sizes. Then

1. $(kA) \otimes B = A \otimes (kB) = k(A \otimes B)$, where $k$ is a scalar.
2. $(A + B) \otimes C = A \otimes C + B \otimes C$.
3. $A \otimes (B + C) = A \otimes B + A \otimes C$.
4. $(A \otimes B) \otimes C = A \otimes (B \otimes C)$.
5. $A \otimes B = 0$ if and only if $A = 0$ or $B = 0$.
6. $(A \otimes B)^T = A^T \otimes B^T$. If $A$ and $B$ are symmetric, so is $A \otimes B$.
7. $(A \otimes B)^* = A^* \otimes B^*$. If $A$ and $B$ are Hermitian, so is $A \otimes B$.


Theorem 4.6 Let $A$, $B$, $C$, $D$ be matrices of appropriate sizes. Then

1. $(A \otimes B)(C \otimes D) = (AC) \otimes (BD)$.
2. $(A \otimes B)^{-1} = A^{-1} \otimes B^{-1}$ if $A$ and $B$ are invertible.
3. $A \otimes B$ is unitary if $A$ and $B$ are unitary.
4. $A \otimes B$ is normal if $A$ and $B$ are normal.


Proof. For (1), let $A$ have $n$ columns. Then $C$ has $n$ rows as indicated in the product $AC$ on the right-hand side of (1). We write $A \otimes B = (a_{ij}B)$, $C \otimes D = (c_{ij}D)$. Then the $(i, j)$ block of $(A \otimes B)(C \otimes D)$ is

$$\sum_{t=1}^{n} a_{it} B\, c_{tj} D = \sum_{t=1}^{n} a_{it} c_{tj}\, BD.$$

But this is the $(i, j)$-entry of $AC$ times $BD$, which is the $(i, j)$ block of $(AC) \otimes (BD)$. (1) follows. The rest are immediate from (1).
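
The mixed-product rule and the inverse rule of Theorem 4.6 can be spot-checked numerically. This is only a sketch on a few sample matrices (random ones drawn with a fixed seed, plus two small invertible matrices chosen for illustration), not a proof.

```python
import numpy as np

rng = np.random.default_rng(0)
# Sizes chosen so that AC and BD are defined: A is 2x3, C is 3x2, B is 2x2, D is 2x4.
A, C = rng.standard_normal((2, 3)), rng.standard_normal((3, 2))
B, D = rng.standard_normal((2, 2)), rng.standard_normal((2, 4))

lhs = np.kron(A, B) @ np.kron(C, D)       # (A (x) B)(C (x) D)
rhs = np.kron(A @ C, B @ D)               # (AC) (x) (BD)
mixed_product_ok = np.allclose(lhs, rhs)

# Two invertible matrices for the inverse rule.
S = np.array([[2.0, 1.0], [1.0, 1.0]])
T = np.array([[1.0, 2.0], [0.0, 3.0]])
inverse_ok = np.allclose(np.linalg.inv(np.kron(S, T)),
                         np.kron(np.linalg.inv(S), np.linalg.inv(T)))
```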


To perform the Hadamard product, matrices need to have the same size. In the case of square matrices, an interesting and important observation is that the Hadamard product $A \circ B$ is contained in the Kronecker product $A \otimes B$ as a principal submatrix.

Theorem 4.7 Let $A, B \in M_n$. Then the Hadamard product $A \circ B$ is a principal submatrix of the Kronecker product $A \otimes B$ lying on the intersections of rows and columns $1, n+2, 2n+3, \dots, n^2$.


Proof. Let $e_i$ be, as usual, the column vector of $n$ components with the $i$th position 1 and 0 elsewhere, $i = 1, 2, \dots, n$, and let

$$E = (e_1 \otimes e_1, \dots, e_n \otimes e_n).$$

Then for every pair of $i$ and $j$, we have by computation

$$a_{ij} b_{ij} = (e_i^T A e_j) \otimes (e_i^T B e_j) = (e_i \otimes e_i)^T (A \otimes B)(e_j \otimes e_j),$$

which equals the $(i, j)$-entry of the matrix $E^T (A \otimes B) E$. Thus,

$$E^T (A \otimes B) E = A \circ B.$$

This says that $A \circ B$ is the principal submatrix of $A \otimes B$ lying on the intersections of rows and columns $1, n+2, 2n+3, \dots, n^2$.
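
The index pattern in Theorem 4.7 is easy to verify in code. In the sketch below (NumPy assumed, matrices chosen arbitrarily), the 1-based positions $1, n+2, 2n+3, \dots, n^2$ become the 0-based indices $i \cdot n + i$.

```python
import numpy as np

n = 3
A = np.arange(1, 10).reshape(n, n)        # arbitrary 3 x 3 integer matrices
B = np.arange(10, 19).reshape(n, n)

# 0-based versions of rows/columns 1, n+2, 2n+3, ..., n^2.
idx = [i * n + i for i in range(n)]

# Principal submatrix of the Kronecker product on those rows and columns.
sub = np.kron(A, B)[np.ix_(idx, idx)]
```

Here `sub` coincides with the entrywise product `A * B`.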


The following theorem, relating the eigenvalues of the Kronecker product to those of individual matrices, presents in its proof a common method of decomposing a Kronecker product.


Theorem 4.8 Let $A$ and $B$ be $m$-square and $n$-square complex matrices with eigenvalues $\lambda_i$ and $\mu_j$, $i = 1, \dots, m$, $j = 1, \dots, n$, respectively. Then the eigenvalues of $A \otimes B$ are

$$\lambda_i \mu_j, \quad i = 1, \dots, m, \ j = 1, \dots, n,$$

and the eigenvalues of $A \otimes I_n + I_m \otimes B$ are

$$\lambda_i + \mu_j, \quad i = 1, \dots, m, \ j = 1, \dots, n.$$


Proof. By the Schur decomposition (Theorem 3.3), let $U$ and $V$ be unitary matrices of sizes $m$ and $n$, respectively, such that

$$U^* A U = T_1 \quad \text{and} \quad V^* B V = T_2,$$

where $T_1$ and $T_2$ are upper-triangular matrices with diagonal entries $\lambda_i$ and $\mu_j$, $i = 1, \dots, m$, $j = 1, \dots, n$, respectively. Then

$$T_1 \otimes T_2 = (U^* A U) \otimes (V^* B V) = (U^* \otimes V^*)(A \otimes B)(U \otimes V).$$

Note that $U \otimes V$ is unitary. Thus $A \otimes B$ is unitarily similar to $T_1 \otimes T_2$. The eigenvalues of the latter matrix are $\lambda_i \mu_j$.

For the second part, let $W = U \otimes V$. Then

$$W^* (A \otimes I_n) W = T_1 \otimes I_n = \begin{pmatrix} \lambda_1 I_n & & * \\ & \ddots & \\ 0 & & \lambda_m I_n \end{pmatrix}$$

and

$$W^* (I_m \otimes B) W = I_m \otimes T_2 = \begin{pmatrix} T_2 & & 0 \\ & \ddots & \\ 0 & & T_2 \end{pmatrix}.$$

Thus

$$W^* (A \otimes I_n + I_m \otimes B) W = T_1 \otimes I_n + I_m \otimes T_2$$

is an upper-triangular matrix with eigenvalues $\lambda_i + \mu_j$.
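
Theorem 4.8 can be illustrated numerically. The sketch below assumes NumPy and, purely to keep all eigenvalues real and easy to sort, uses two small real symmetric matrices; the theorem itself holds for arbitrary complex square matrices.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])                       # m = 2, symmetric
B = np.array([[1.0, 0.0, 1.0],
              [0.0, 2.0, 0.0],
              [1.0, 0.0, 4.0]])                  # n = 3, symmetric

lam = np.linalg.eigvalsh(A)                      # eigenvalues of A
mu = np.linalg.eigvalsh(B)                       # eigenvalues of B
m, n = len(lam), len(mu)

products = np.sort(np.outer(lam, mu).ravel())    # all lambda_i * mu_j
sums = np.sort(np.add.outer(lam, mu).ravel())    # all lambda_i + mu_j

eig_kron = np.linalg.eigvalsh(np.kron(A, B))
eig_ksum = np.linalg.eigvalsh(np.kron(A, np.eye(n)) + np.kron(np.eye(m), B))
```

Up to rounding, `eig_kron` matches `products` and `eig_ksum` matches `sums`.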
Problems 


1. Compute A@ B and B® A for 


0 v2 
A= ( . : ) , B= nr 2 ; 
-1 7 
2. Let Jo=(j} 1). Compute I, ® Jo and Jo ® In. 
3. Let $A$, $B$, $C$, and $D$ be complex matrices. Show that
(a) $(A \otimes B)^k = A^k \otimes B^k$.
(b) $\operatorname{tr}(A \otimes B) = \operatorname{tr} A \, \operatorname{tr} B$.
(c) $\operatorname{rank}(A \otimes B) = \operatorname{rank}(A)\, \operatorname{rank}(B)$.
(d) $\det(A \otimes B) = (\det A)^n (\det B)^m$, if $A \in M_m$ and $B \in M_n$.
(e) If $A \otimes B = C \otimes D \neq 0$, where $A$ and $C$ are of the same size, then $A = aC$ and $B = bD$ with $ab = 1$, and vice versa.

4. Let $A$ and $B$ be $m$- and $n$-square matrices, respectively. Show that

$$(A \otimes I_n)(I_m \otimes B) = A \otimes B = (I_m \otimes B)(A \otimes I_n).$$

5. Let $A \in M_n$ have characteristic polynomial $p$. Show that

$$\det(A \otimes I + I \otimes A) = (-1)^n \det p(-A).$$

6. Let $A, B \in M_n$. Show that for some permutation matrix $P \in M_{n^2}$

$$P^{-1}(A \otimes B)P = B \otimes A.$$

7. Let $x, y, u, v \in \mathbb{C}^n$. With $(x, y) = y^* x$ and $\|x\|^2 = x^* x$, show that

$$(x, y)(u, v) = (x \otimes u, \ y \otimes v).$$

Derive

$$\|x \otimes y\| = \|x\|\, \|y\|.$$

8. Let $A, B \in M_n$. Show that $A \circ I_n = \operatorname{diag}(a_{11}, \dots, a_{nn})$ and that

$$D_1 (A \circ B) D_2 = (D_1 A D_2) \circ B = A \circ (D_1 B D_2)$$

for any $n$-square diagonal matrices $D_1$ and $D_2$.

9. Let $A$, $B$, and $C$ be square matrices. Show that

$$(A \oplus B) \otimes C = (A \otimes C) \oplus (B \otimes C).$$

But it need not be true that

$$(A \otimes B) \oplus C = (A \oplus C) \otimes (B \oplus C).$$

10. Consider the vector space $M_2$, $2 \times 2$ complex matrices, over $\mathbb{C}$.

(a) What is the dimension of $M_2$?
(b) Find a basis for $M_2$.
(c) For $A, B \in M_2$, define

$$\mathcal{L}(X) = AXB, \quad X \in M_2.$$

Show that $\mathcal{L}$ is a linear transformation on $M_2$.

(d) Show that if $\lambda$ and $\mu$ are eigenvalues of $A$ and $B$, respectively, then $\lambda\mu$ is an eigenvalue of $\mathcal{L}$.

11. Let $A$ and $B$ be square matrices (of possibly different sizes). Show that


e428! — Aggy, e183 — T @e?, e408 — A @eF,

4.4 Compound Matrices

We now turn our attention to compound matrices. Roughly speaking, 
the Kronecker product and the Hadamard product are operations 
on two (or more) matrices. Unlike the Kronecker and Hadamard 
products, “compounding” matrix is a matrix operation on a single 
matrix that arranges in certain order all minors of a given size from 
the given matrix. A rigorous definition is given as follows. 

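Let $A$ be an $m \times n$ matrix, $\alpha = \{i_1, \dots, i_s\}$, and $\beta = \{j_1, \dots, j_t\}$, $1 \le i_1 < \cdots < i_s \le m$, $1 \le j_1 < \cdots < j_t \le n$. Denote by $A[i_1, \dots, i_s \,|\, j_1, \dots, j_t]$, or simply $A[\alpha|\beta]$, the submatrix of $A$ consisting of the entries in rows $i_1, \dots, i_s$ and columns $j_1, \dots, j_t$.

Given a positive integer $k \le \min\{m, n\}$, there are $\binom{m}{k} \times \binom{n}{k}$ possible minors (numbers) that we can get from the $m \times n$ matrix $A$. We now form a matrix, denoted by $A^{(k)}$ and called the $k$th compound matrix of $A$, of size $\binom{m}{k} \times \binom{n}{k}$ by ordering these numbers lexicographically; that is, the $(1,1)$ position of $A^{(k)}$ is $\det A[1, \dots, k \,|\, 1, \dots, k]$, the $(1,2)$ position of $A^{(k)}$ is $\det A[1, \dots, k \,|\, 1, \dots, k-1, k+1]$, ..., whereas the $(2,1)$ position is $\det A[1, \dots, k-1, k+1 \,|\, 1, \dots, k]$, ..., and so on. For convenience, we say that the minor $\det A[\alpha|\beta]$ is in the $(\alpha, \beta)$ position of the compound matrix and denote it by $A^{(k)}_{\alpha,\beta}$. Clearly, $A^{(1)} = A$ and $A^{(n)} = \det A$ if $A$ is an $n \times n$ matrix.


As an example, let $m = n = 3$, $k = 2$, and take

$$A = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix}.$$

Then

$$A^{(2)} = \begin{pmatrix} \det A[1,2|1,2] & \det A[1,2|1,3] & \det A[1,2|2,3] \\ \det A[1,3|1,2] & \det A[1,3|1,3] & \det A[1,3|2,3] \\ \det A[2,3|1,2] & \det A[2,3|1,3] & \det A[2,3|2,3] \end{pmatrix}$$

$$= \begin{pmatrix} \begin{vmatrix} 1 & 2 \\ 4 & 5 \end{vmatrix} & \begin{vmatrix} 1 & 3 \\ 4 & 6 \end{vmatrix} & \begin{vmatrix} 2 & 3 \\ 5 & 6 \end{vmatrix} \\[2mm] \begin{vmatrix} 1 & 2 \\ 7 & 8 \end{vmatrix} & \begin{vmatrix} 1 & 3 \\ 7 & 9 \end{vmatrix} & \begin{vmatrix} 2 & 3 \\ 8 & 9 \end{vmatrix} \\[2mm] \begin{vmatrix} 4 & 5 \\ 7 & 8 \end{vmatrix} & \begin{vmatrix} 4 & 6 \\ 7 & 9 \end{vmatrix} & \begin{vmatrix} 5 & 6 \\ 8 & 9 \end{vmatrix} \end{pmatrix} = \begin{pmatrix} -3 & -6 & -3 \\ -6 & -12 & -6 \\ -3 & -6 & -3 \end{pmatrix}.$$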
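
The definition translates directly into a short routine. The following is a minimal sketch, assuming NumPy; the function name `compound` is hypothetical. It relies on `itertools.combinations`, which enumerates index sets in lexicographic order, and it recomputes the example above.

```python
from itertools import combinations
import numpy as np

def compound(A, k):
    """k-th compound matrix: all k x k minors of A in lexicographic order."""
    m, n = A.shape
    rows = list(combinations(range(m), k))   # index sets alpha
    cols = list(combinations(range(n), k))   # index sets beta
    return np.array([[np.linalg.det(A[np.ix_(r, c)]) for c in cols]
                     for r in rows])

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 9.0]])

A2 = compound(A, 2)      # should reproduce the 3 x 3 matrix of 2 x 2 minors
print(np.round(A2))
```

Because the minors are computed in floating point, the entries agree with the exact values only up to rounding. This brute-force approach evaluates every minor and is meant for small matrices only.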
If $A$ is an $n$-square matrix, then the main diagonal entries of $A^{(k)}$ are $\det A[\alpha|\alpha]$, i.e., the principal minors of $A$. For an $n$-square upper-triangular matrix $A$, $\det A[\alpha|\beta] = 0$ if $\alpha$ is after $\beta$ in lexicographic order. This leads to the result that if $A$ is upper (lower)-triangular, then so is $A^{(k)}$. As a consequence, if $A$ is diagonal, then so is $A^{(k)}$.

The goal of this section is to show that the compound matrix of the product of matrices is the product of their compound matrices. For this purpose, we need to borrow a well-known result on determinant expansion, the Binet-Cauchy formula. (A good reference on this formula and its proof is Lancaster and Tismenetsky’s book, The Theory of Matrices, 1985, pp. 36-42.)


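Theorem 4.9 (Binet-Cauchy formula) Let $C = AB$, where $A$ is $m \times n$ and $B$ is $n \times m$, $m \le n$, and let $\alpha = \{1, 2, \dots, m\}$. Then

$$\det C = \sum_{\beta} \det A[\alpha|\beta] \, \det B[\beta|\alpha],$$

where $\beta$ runs over all sequences $\{j_1, \dots, j_m\}$, $1 \le j_1 < \cdots < j_m \le n$.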


The following theorem is of fundamental importance for com- 
pound matrices, whose corollary plays a pivotal role in deriving ma- 
trix inequalities involving eigenvalue and singular value products. 


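Theorem 4.10 Let $A$ be an $m \times p$ matrix and $B$ be a $p \times n$ matrix. If $k$ is a positive integer, $k \le \min\{m, p, n\}$, then $(AB)^{(k)} = A^{(k)} B^{(k)}$.


Proof. For $\alpha = \{i_1, \dots, i_k\}$ and $\beta = \{j_1, \dots, j_k\}$, $1 \le i_1 < \cdots < i_k \le m$, $1 \le j_1 < \cdots < j_k \le n$, we compute the entry in the $(\alpha, \beta)$ position (in lexicographic order) of $(AB)^{(k)}$ by the Binet-Cauchy determinant expansion formula and get

$$(AB)^{(k)}_{\alpha,\beta} = \det\big((AB)[\alpha|\beta]\big) = \sum_{\gamma} \det A[\alpha|\gamma] \, \det B[\gamma|\beta] = \big(A^{(k)} B^{(k)}\big)_{\alpha,\beta},$$

where $\gamma$ runs over all possible sequences $1 \le \gamma_1 < \cdots < \gamma_k \le p$.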
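
The multiplicativity in Theorem 4.10 can be checked on random rectangular matrices. The sketch below assumes NumPy and repeats a hypothetical helper `compound` (brute-force minors in lexicographic order) so that the block is self-contained; the sizes and seed are arbitrary.

```python
from itertools import combinations
import numpy as np

def compound(A, k):
    """k-th compound matrix: all k x k minors of A in lexicographic order."""
    m, n = A.shape
    rows = list(combinations(range(m), k))
    cols = list(combinations(range(n), k))
    return np.array([[np.linalg.det(A[np.ix_(r, c)]) for c in cols]
                     for r in rows])

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 4))      # m x p with m = 3, p = 4
B = rng.standard_normal((4, 3))      # p x n with n = 3
k = 2

left = compound(A @ B, k)                     # (AB)^(k), size 3 x 3
right = compound(A, k) @ compound(B, k)       # A^(k) B^(k), (3 x 6)(6 x 3)
```

Up to rounding error, `left` and `right` agree.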


If $A$ is Hermitian, it is readily seen that $A^{(k)}$ is Hermitian too.

Corollary 4.1 Let $A \in M_n$ be a positive semidefinite matrix with eigenvalues $\lambda_1(A) \ge \cdots \ge \lambda_n(A)$. Then the largest eigenvalue of $A^{(k)}$ is the product of the first $k$ largest eigenvalues of $A$; that is,

$$\lambda_{\max}\big(A^{(k)}\big) = \prod_{i=1}^{k} \lambda_i(A).$$


Problems 


1. Find A), where 


2. Show that $A^*[\alpha|\beta] = (A[\beta|\alpha])^*$.

3. Show that $I_n^{(k)} = I_{\binom{n}{k}}$, where $I_l$ is the $l \times l$ identity matrix.

4. Show that $(A^{(k)})^* = (A^*)^{(k)}$; $(A^{(k)})^T = (A^T)^{(k)}$.

5. Show that $(A^{(k)})^{-1} = (A^{-1})^{(k)}$ if $A$ is nonsingular.

6. Show that $\det A^{(k)} = (\det A)^{\binom{n-1}{k-1}}$ when $A$ is $n$-square.

7. If $\operatorname{rank}(A) = r$, show that $\operatorname{rank}(A^{(k)}) = \binom{r}{k}$ or 0 if $r < k$.

8. Show that if $A$ is unitary, symmetric, positive (semi-)definite, Hermitian, or normal, then so is $A^{(k)}$, respectively.

9. If $A = \operatorname{diag}(a_1, \dots, a_n)$, show that $A^{(k)}$ is an $\binom{n}{k} \times \binom{n}{k}$ diagonal matrix with diagonal entries $a_{i_1} \cdots a_{i_k}$, $1 \le i_1 < \cdots < i_k \le n$.

10. If $A \in M_n$ has eigenvalues $\lambda_1, \dots, \lambda_n$, show that $A^{(k)}$ has eigenvalues $\lambda_{i_1} \cdots \lambda_{i_k}$, $1 \le i_1 < \cdots < i_k \le n$.

11. If $A \in M_n$ has singular values $\sigma_1, \dots, \sigma_n$, show that $A^{(k)}$ has singular values $\sigma_{i_1} \cdots \sigma_{i_k}$, $1 \le i_1 < \cdots < i_k \le n$.

12. If $A \in M_n$ has eigenvalues $\lambda_1, \dots, \lambda_n$, show that $\operatorname{tr}(A^{(k)})$ equals $\sum_{\gamma} \lambda_{i_1} \cdots \lambda_{i_k}$, denoted by $s_k(\lambda_1, \dots, \lambda_n)$ and called the $k$th elementary symmetric function, where $\gamma$ is any sequence $1 \le i_1 < \cdots < i_k \le n$.

13. If $A$ is a positive semidefinite matrix with eigenvalues $\lambda_1 \ge \cdots \ge \lambda_n$, show that the smallest eigenvalue of $A^{(k)}$ is $\lambda_{\min}(A^{(k)}) = \prod_{i=1}^{k} \lambda_{n-i+1}$.


CHAPTER 5


Special Types of Matrices 


Introduction: This chapter studies special types of matrices. They 
are: idempotent matrices, nilpotent matrices, involutary matrices, 
projection matrices, tridiagonal matrices, circulant matrices, Vander- 
monde matrices, Hadamard matrices, permutation matrices, doubly 
stochastic matrices, and nonnegative matrices. These matrices are 
often used in many subjects of mathematics and in other fields. 


5.1 Idempotence, Nilpotence, Involution, and Projections 


We first present three types of matrices that have simple structures 

under similarity: idempotent matrices, nilpotent matrices, and invo- 

lutions. We then turn attention to orthogonal projection matrices. 
A square matrix $A$ is said to be idempotent, or a projection, if

$$A^2 = A,$$

nilpotent if for some positive integer $k$

$$A^k = 0,$$

and involutary if

$$A^2 = I.$$

Theorem 5.1 Let $A$ be an $n$-square complex matrix. Then

1. $A$ is idempotent if and only if $A$ is similar to a diagonal matrix of the form $\operatorname{diag}(1, \dots, 1, 0, \dots, 0)$.
2. $A$ is nilpotent if and only if all the eigenvalues of $A$ are zero.
3. $A$ is involutary if and only if $A$ is similar to a diagonal matrix of the form $\operatorname{diag}(1, \dots, 1, -1, \dots, -1)$.

Proof. The sufficiency in (1) is obvious. To see the necessity, let

$$A = P^{-1}(J_1 \oplus \cdots \oplus J_k)P$$

be a Jordan decomposition of $A$. Then for each $i$, $i = 1, \dots, k$,

$$A^2 = A \ \Rightarrow \ J_i^2 = J_i.$$

Observe that if $J$ is a Jordan block and if $J^2 = J$, then $J$ must be of size 1; that is, $J$ is a number. The assertion then follows.

For (2), consider the Schur (or Jordan) decomposition of $A$,

$$A = U^{-1} \begin{pmatrix} \lambda_1 & & * \\ & \ddots & \\ 0 & & \lambda_n \end{pmatrix} U,$$

where $U$ is an $n$-square unitary matrix. If $A^k = 0$, then each $\lambda_i^k = 0$, and $A$ has only zero eigenvalues. Conversely, it is easy to verify by computation that $A^n = 0$ if all the eigenvalues of $A$ are equal to zero (see also Problem 11, Section 3.3).

The proof of (3) is similar to that of (1).


Theorem 5.2 Let $A$ and $B$ be nilpotent matrices of the same size. If $A$ and $B$ commute, then $A + B$ is nilpotent.


Proof. Let $A^m = 0$ and $B^n = 0$. Upon computation, we have

$$(A + B)^{m+n} = 0,$$

for each term in the expansion of $(A + B)^{m+n}$ is $A^{m+n}$, is $B^{m+n}$, or contains $A^s B^t$, $s \ge m$ or $t \ge n$. In any case, every term vanishes.

By choosing a suitable basis for $\mathbb{C}^n$, we can interpret Theorem 5.1(1) as follows. A matrix $A$ is a projection if and only if $\mathbb{C}^n$ can be decomposed as

$$\mathbb{C}^n = W_1 \oplus W_2, \tag{5.1}$$

where $W_1$ and $W_2$ are subspaces such that for all $w_1 \in W_1$, $w_2 \in W_2$,

$$A w_1 = w_1, \quad A w_2 = 0.$$

Thus, if $w = w_1 + w_2 \in \mathbb{C}^n$, where $w_1 \in W_1$ and $w_2 \in W_2$, then

$$A w = A w_1 + A w_2 = w_1.$$

Such a $w_1$ is called the projection of $w$ on $W_1$. Note that

$$W_1 = \operatorname{Im} A, \quad W_2 = \operatorname{Ker} A = \operatorname{Im}(I - A).$$

Using this and Theorem 5.1(1), one may prove the next result.

Theorem 5.3 For any $A \in M_n$ the following are equivalent.

1. $A$ is a projection matrix; that is, $A^2 = A$.
2. $\mathbb{C}^n = \operatorname{Im} A + \operatorname{Ker} A$ with $Ax = x$ for every $x \in \operatorname{Im} A$.
3. $\operatorname{Ker} A = \operatorname{Im}(I - A)$.
4. $\operatorname{rank}(A) + \operatorname{rank}(I - A) = n$.
5. $\operatorname{Im} A \cap \operatorname{Im}(I - A) = \{0\}$.

We now turn our attention to orthogonal projection matrices. A square complex matrix $A$ is called an orthogonal projection if

$$A^2 = A = A^*.$$

For orthogonal projection matrices, the subspaces

$$W_1 = \operatorname{Im} A \quad \text{and} \quad W_2 = \operatorname{Im}(I - A)$$

in (5.1) are orthogonal; that is, for all $w_1 \in W_1$ and $w_2 \in W_2$,

$$(w_1, w_2) = 0. \tag{5.2}$$

In other words, $(Ax, (I - A)x) = 0$ for all $x \in \mathbb{C}^n$; this is because

$$(w_1, w_2) = (A w_1, w_2) = (w_1, A^* w_2) = (w_1, A w_2) = 0.$$
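
The difference between a general projection and an orthogonal projection shows up clearly on small examples. The sketch below assumes NumPy; the matrices `E` and `M` are arbitrary illustrations, and the orthogonal projection onto the column space of a full-column-rank `M` is built with the standard formula $M(M^*M)^{-1}M^*$.

```python
import numpy as np

# An idempotent matrix that is not Hermitian (an oblique projection).
E = np.array([[1.0, 1.0],
              [0.0, 0.0]])
I2 = np.eye(2)
rank_sum = np.linalg.matrix_rank(E) + np.linalg.matrix_rank(I2 - E)  # should equal n = 2

# Orthogonal projection onto the column space of M.
M = np.array([[1.0], [1.0], [0.0]])
Q = M @ np.linalg.inv(M.conj().T @ M) @ M.conj().T

x = np.array([1.0, 2.0, 3.0])
inner = np.vdot((np.eye(3) - Q) @ x, Q @ x)   # inner product of Qx and (I - Q)x

v = np.array([1.0, 1.0])                      # E can lengthen a vector: Ev = (2, 0)
```

For the orthogonal projection `Q`, the components `Q @ x` and `(I - Q) @ x` are orthogonal and `Q` never increases the Euclidean norm, whereas the oblique projection `E` maps `v` to a longer vector.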
Theorem 5.4 For any $A \in M_n$, the following are equivalent.


1. $A$ is an orthogonal projection matrix; that is, $A^2 = A = A^*$.
2. $A = U^* \operatorname{diag}(1, \dots, 1, 0, \dots, 0)\, U$ for some unitary matrix $U$.
3. $\|x - Ax\| \le \|x - Ay\|$ for every $x$ and $y$ in $\mathbb{C}^n$.
4. $A^2 = A$ and $\|Ax\| \le \|x\|$ for every $x \in \mathbb{C}^n$.
5. $A = A^* A$.


Proof. (1)$\Leftrightarrow$(2): We show (1)$\Rightarrow$(2). The other direction is obvious.

Because $A$ is Hermitian, by the spectral decomposition theorem (Theorem 3.4), we have $A = V^* \operatorname{diag}(\lambda_1, \dots, \lambda_n) V$ for some unitary matrix $V$, where the $\lambda_i$ are the eigenvalues of $A$. However, $A$ is idempotent and thus has only eigenvalues 1 and 0 according to the previous theorem. It follows that

$$A = U^* \operatorname{diag}(\overbrace{1, \dots, 1}^{r}, 0, \dots, 0)\, U,$$

where $r$ is the rank of $A$ and $U$ is some unitary matrix.
(1)$\Leftrightarrow$(3): For (1)$\Rightarrow$(3), let $A$ be an orthogonal projection. We have the decomposition (5.1) with the orthogonality condition (5.2). Let $x = x_1 + x_2$, where $x_1 \in W_1$, $x_2 \in W_2$, and $(x_1, x_2) = 0$. Similarly, write $y = y_1 + y_2$. Note that $x_1 - y_1 \in W_1$ and $W_1 \perp W_2$. Since $(u, v) = 0$ implies $\|u\|^2 + \|v\|^2 = \|u + v\|^2$, we have

$$\|x - Ax\|^2 = \|x_2\|^2 \le \|x_2\|^2 + \|x_1 - y_1\|^2 = \|x_2 + (x_1 - y_1)\|^2 = \|x - Ay\|^2.$$

We now show (3)$\Rightarrow$(1). It is sufficient to show that the decomposition (5.1) with the orthogonality condition (5.2) holds, where $\operatorname{Im} A$ serves as $W_1$ and $\operatorname{Im}(I - A)$ as $W_2$.

As $x = Ax + (I - A)x$ for every $x \in \mathbb{C}^n$, it is obvious that

$$\mathbb{C}^n = \operatorname{Im} A + \operatorname{Im}(I - A).$$

We have left to show that $(x, y) = 0$ if $x \in \operatorname{Im} A$ and $y \in \operatorname{Im}(I - A)$. Suppose instead that $((I - A)x, Ay) \neq 0$ for some $x$ and $y \in \mathbb{C}^n$. We show that there exists a vector $z \in \mathbb{C}^n$ such that

$$\|x - Az\| < \|x - Ax\|,$$

which is a contradiction to the given condition (3).

Let $((I - A)x, Ay) = \alpha \neq 0$. We may assume that $\alpha < 0$. Otherwise, replace $x$ with $e^{i\theta}x$, where $\theta \in \mathbb{R}$ is such that $e^{i\theta}\alpha < 0$. Let $z_\epsilon = x - \epsilon y$, where $\epsilon > 0$. Then

$$\begin{aligned} \|x - Az_\epsilon\|^2 &= \|(x - Ax) + (Ax - Az_\epsilon)\|^2 \\ &= \|x - Ax\|^2 + \|Ax - Az_\epsilon\|^2 + 2\operatorname{Re}\big((I - A)x, A(x - z_\epsilon)\big) \\ &= \|x - Ax\|^2 + \|Ax - Az_\epsilon\|^2 + 2\epsilon\big((I - A)x, Ay\big) \\ &= \|x - Ax\|^2 + \epsilon^2 \|Ay\|^2 + 2\epsilon\alpha. \end{aligned}$$

Because $\alpha < 0$, we have $\epsilon^2 \|Ay\|^2 + 2\epsilon\alpha < 0$ for some $\epsilon$ small enough, which results in a contradiction to the assumption in (3):

$$\|x - Az_\epsilon\| < \|x - Ax\|.$$

(1)$\Rightarrow$(4): If $A$ is an orthogonal projection matrix, then the orthogonality condition (5.2) holds. Thus, $(Ax, (I - A)x) = 0$ and

$$\|Ax\|^2 \le \|Ax\|^2 + \|(I - A)x\|^2 = \|Ax + (I - A)x\|^2 = \|x\|^2.$$


(4)$\Rightarrow$(5): If $A \neq A^*A$; that is, $(A^* - I)A \neq 0$ or $A^*(I - A) \neq 0$, then $\operatorname{rank}(I - A) < n$ and $\dim \operatorname{Im}(I - A) < n$ by Theorem 5.1(1). We show that there exists a nonzero $x$ such that

$$(x, (I - A)x) = 0, \quad \text{but} \quad (I - A)x \neq 0.$$

Thus, for this $x$,

$$\|Ax\|^2 = \|x - (I - A)x\|^2 = \|x\|^2 + \|(I - A)x\|^2 > \|x\|^2,$$

which contradicts the condition $\|Ax\| \le \|x\|$ for every $x \in \mathbb{C}^n$.

To show the existence of such a vector $x$, it is sufficient to show that there exists a nonzero $x$ in $(\operatorname{Im}(I - A))^\perp$ but not in $\operatorname{Ker}(I - A)$; that is, $(\operatorname{Im}(I - A))^\perp$ is not contained in $\operatorname{Ker}(I - A)$.

Notice that (Theorem 1.5)

$$\dim \operatorname{Im}(I - A) + \dim \operatorname{Ker}(I - A) = n$$

and that

$$\mathbb{C}^n = \operatorname{Im}(I - A) \oplus (\operatorname{Im}(I - A))^\perp.$$

Now if $(\operatorname{Im}(I - A))^\perp$ is contained in $\operatorname{Ker}(I - A)$, then they must be equal, for they have the same dimension:

$$\dim (\operatorname{Im}(I - A))^\perp = n - \dim \operatorname{Im}(I - A) = \dim \operatorname{Ker}(I - A).$$

It follows, by (1.11) in Section 1.4 of Chapter 1, that

$$\operatorname{Im}(I - A) = \operatorname{Im}(I - A^*).$$

Thus, $I - A = I - A^*$ and $A$ is Hermitian. Then (5) follows easily.

(5)$\Rightarrow$(1): If $A = A^*A$, then $A$ is obviously Hermitian. Thus,

$$A = A^*A = AA = A^2.$$


Problems


1. Characterize all $2 \times 2$ idempotent, nilpotent, and involutary matrices up to similarity.


2. Can a nonzero matrix be both idempotent and nilpotent? Why?

3. What is the characteristic polynomial of a nilpotent matrix?

4. What idempotent matrices are nonsingular?

5. Show that if $A \in M_n$ is idempotent, then so is $P^{-1}AP$ for any invertible $P \in M_n$.

6. Show that the rank of an idempotent matrix is equal to the number of nonzero eigenvalues of the matrix.

7. Let $A$ and $B$ be idempotent matrices of the same size. Find the necessary and sufficient conditions for $A + B$ to be idempotent. Discuss the analogue for $A - B$.

8. Show that $\frac{1}{2}(I + A)$ is idempotent if and only if $A$ is an involution.

9. Let $A$ be a square complex matrix. Show that

$$A^2 = A \ \Rightarrow \ \operatorname{rank}(A) = \operatorname{tr}(A) \ \text{ and } \ \operatorname{rank}(I - A) = \operatorname{tr}(I - A).$$

If $A^2 = -A$, what rank conditions on $A$ does one get?


10. Let $A$ be an idempotent matrix. Show that

$$A = A^* \ \Leftrightarrow \ \operatorname{Im} A = \operatorname{Im} A^*.$$

11. Show that a Hermitian idempotent matrix is positive semidefinite and that the matrix $M$ is positive semidefinite, where

$$M = I - \frac{1}{y^* y}\, y y^*, \quad y \in \mathbb{C}^n.$$


Show that 7? = 0, where 7 is a transformation defined on Mz by 


T(X)=TX—XT, X€Mb, 


r=(23). 


Find the Jordan form of a matrix representation of 7. 


with 


13. Let $A \in M_n$. Define a linear transformation on $M_n$ by

$$\mathcal{T}(X) = AX - XA, \quad X \in M_n.$$

Show that if $A$ is nilpotent then $\mathcal{T}$ is nilpotent and that if $A$ is diagonalizable then the matrix representation of $\mathcal{T}$ is diagonalizable.

14. Let $A$ and $B$ be square matrices of the same size. If $A$ is nilpotent and $AB = BA$, show that $AB$ is nilpotent. Is the converse true?

15. Let $A$ be an $n$-square nonsingular matrix. If $X$ is a matrix such that

$$AXA^{-1} = \lambda X, \quad \lambda \in \mathbb{C},$$

show that $|\lambda| = 1$ or $X$ is nilpotent.

16. Give a $2 \times 2$ matrix such that $A^2 = I$ but $A^*A \neq I$.

17. Let $A^2 = A$. Show that $(A + I)^k = I + (2^k - 1)A$ for $k = 1, 2, \dots$.

18. Find a matrix that is a projection but not an orthogonal projection.

19. Let $A$ and $B$ be square matrices of the same size. If $AB = A$ and $BA = B$, show that $A$ and $B$ are projection matrices.

20. Let $A$ be a projection matrix. Show that $A$ is Hermitian if and only if $\operatorname{Im} A$ and $\operatorname{Ker} A$ are orthogonal; that is, $\operatorname{Im} A \perp \operatorname{Ker} A$.


21. Prove Theorem 5.3 along the line: (1)$\Leftrightarrow$(2)$\Rightarrow$(3)$\Rightarrow$(4)$\Rightarrow$(5)$\Rightarrow$(2).

22. Show that $A$ is an orthogonal projection matrix if and only if $A = B^*B$ for some matrix $B$ with $BB^* = I$.

23. Let $A_1, \dots, A_m$ be $n \times n$ idempotent matrices. If

$$A_1 + \cdots + A_m = I_n,$$

show that

$$A_i A_j = 0, \quad i \neq j.$$

[Hint: Show that $\mathbb{C}^n = \operatorname{Im} A_1 \oplus \cdots \oplus \operatorname{Im} A_m$ by using trace.]

24. If $W$ is a subspace of an inner product space $V$, one may write

$$V = W \oplus W^\perp$$

and define a transformation $\mathcal{A}$ on $V$ by

$$\mathcal{A}(v) = w, \quad \text{if } v = w + w^\perp, \ w \in W, \ w^\perp \in W^\perp,$$

where $w$ is called the projection of $v$ on $W$. Show that

(a) $\mathcal{A}$ is a linear transformation.
(b) $\mathcal{A}^2 = \mathcal{A}$.
(c) $\operatorname{Im}(\mathcal{A}) = W$ and $\operatorname{Ker}(\mathcal{A}) = W^\perp$.
(d) $\|v - \mathcal{A}(v)\| \le \|v - \mathcal{A}(u)\|$ for any $u \in V$.
(e) Every $v \in V$ has a unique projection $w \in W$.
(f) $\|v\|^2 = \|w\|^2 + \|w^\perp\|^2$.

When does equality in Theorem 5.4(3) hold?

25. Let $A$ and $B$ be $m \times n$ complex matrices of rank $n$, $n \le m$. Show that the matrix $A(A^*A)^{-1}A^*$ is idempotent and that

$$A(A^*A)^{-1}A^* = B(B^*B)^{-1}B^* \ \Leftrightarrow \ A = BX$$

for some nonsingular matrix $X$. [Hint: Multiply by $A$.]

26. Let $A$ and $B$ be orthogonal projections of the same size. Show that

(a) $A + B$ is an orthogonal projection if and only if $AB = BA = 0$.
(b) $A - B$ is an orthogonal projection if and only if $AB = BA = B$.
(c) $AB$ is an orthogonal projection if and only if $AB = BA$.


5.2 Tridiagonal Matrices


One of the frequently used techniques in determinant computation is 
recursion. We illustrate this method by computing the determinant 
of a tridiagonal matrix and go on studying the eigenvalues of matrices 
of this kind. 

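An $n$-square tridiagonal matrix is a matrix with entries $t_{ij} = 0$ whenever $|i - j| > 1$. The determinant of a tridiagonal matrix can be calculated inductively. For simplicity, we consider the special tridiagonal matrix

$$T_n = \begin{pmatrix} a & b & & & & 0 \\ c & a & b & & & \\ & c & a & b & & \\ & & \ddots & \ddots & \ddots & \\ & & & c & a & b \\ 0 & & & & c & a \end{pmatrix}. \tag{5.3}$$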

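Theorem 5.5 Let $T_n$ be defined as in (5.3). Then

$$\det T_n = \begin{cases} a^n & \text{if } bc = 0, \\[1mm] (n + 1)(a/2)^n & \text{if } a^2 = 4bc, \\[1mm] \dfrac{\alpha^{n+1} - \beta^{n+1}}{\alpha - \beta} & \text{if } a^2 \neq 4bc, \end{cases}$$

where

$$\alpha = \frac{a + \sqrt{a^2 - 4bc}}{2}, \qquad \beta = \frac{a - \sqrt{a^2 - 4bc}}{2}.$$


Proof. Expand the determinant along the first row of the matrix in (5.3) to obtain the recursive formula

$$\det T_n = a \det T_{n-1} - bc \det T_{n-2}. \tag{5.4}$$

If $bc = 0$, then $b = 0$ or $c = 0$, and from (5.3) obviously $\det T_n = a^n$. If $bc \neq 0$, let $\alpha$ and $\beta$ be the solutions to $x^2 - ax + bc = 0$. Then

$$\alpha + \beta = a, \quad \alpha\beta = bc.$$

Note that

$$a^2 - 4bc = (\alpha - \beta)^2.$$

From the recursive formula (5.4), we have

$$\det T_n - \alpha \det T_{n-1} = \beta(\det T_{n-1} - \alpha \det T_{n-2})$$

and

$$\det T_n - \beta \det T_{n-1} = \alpha(\det T_{n-1} - \beta \det T_{n-2}).$$

Denote

$$f_n = \det T_n - \alpha \det T_{n-1}, \quad g_n = \det T_n - \beta \det T_{n-1}.$$

Then

$$f_n = \beta f_{n-1}, \quad g_n = \alpha g_{n-1},$$

with (by a simple computation)

$$f_2 = \beta^2, \quad g_2 = \alpha^2.$$

Thus,

$$f_n = \beta^n, \quad g_n = \alpha^n;$$

that is,

$$\det T_n - \alpha \det T_{n-1} = \beta^n, \quad \det T_n - \beta \det T_{n-1} = \alpha^n. \tag{5.5}$$

It follows, using $T_{n+1}$ in (5.5) and subtracting the equations, that

$$\det T_n = \frac{\alpha^{n+1} - \beta^{n+1}}{\alpha - \beta} \quad \text{if } \alpha \neq \beta.$$

If $\alpha = \beta$, one can easily prove by induction that

$$\det T_n = (n + 1)\Big(\frac{a}{2}\Big)^n.$$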
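
The closed form in Theorem 5.5 can be compared with a direct determinant computation. The sketch below assumes NumPy and real $a$, $b$, $c$; the helper names `tridiag` and `det_closed_form` and the sample parameters are illustrative. Complex arithmetic is used for $\alpha$ and $\beta$ because $a^2 - 4bc$ may be negative, in which case the two roots are conjugate and the quotient is still real.

```python
import numpy as np

def tridiag(n, a, b, c):
    """The n x n matrix T_n of (5.3): a on the diagonal, b above, c below."""
    return a * np.eye(n) + b * np.eye(n, k=1) + c * np.eye(n, k=-1)

def det_closed_form(n, a, b, c):
    """det T_n according to Theorem 5.5 (real a, b, c assumed)."""
    if b * c == 0:
        return a ** n
    disc = a * a - 4 * b * c
    if disc == 0:
        return (n + 1) * (a / 2) ** n
    r = np.sqrt(complex(disc))
    alpha, beta = (a + r) / 2, (a - r) / 2
    return ((alpha ** (n + 1) - beta ** (n + 1)) / (alpha - beta)).real

# One sample per case: a^2 = 4bc, positive and negative a^2 - 4bc, and bc = 0.
cases = [(2, 1, 1), (3, 1, 2), (1, 1, 1), (2, 0, 5)]
agree = all(
    np.isclose(np.linalg.det(tridiag(n, a, b, c)), det_closed_form(n, a, b, c))
    for (a, b, c) in cases for n in range(1, 7)
)
```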


Note that the recursive formula (5.4) in the proof depends not on the single values of $b$ and $c$ but on the product $bc$. Thus, if $a \in \mathbb{R}$, $bc > 0$, we may replace $b$ and $c$ by $d$ and $\bar{d}$, respectively, where $d\bar{d} = bc$, to get a tridiagonal Hermitian matrix $H_n$, for which

$$\det(\lambda I - T_n) = \det(\lambda I - H_n).$$

It follows that $T_n$ has only real eigenvalues because $H_n$ does. In fact, when $a, b, c \in \mathbb{R}$, and $bc > 0$, matrix $D T_n D^{-1}$ is real symmetric, where $D$ is the diagonal matrix $\operatorname{diag}(1, e, \dots, e^{n-1})$ with $e = \sqrt{b/c}$.
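
The diagonal similarity just described is straightforward to carry out numerically. The following sketch assumes NumPy; the values $n = 5$, $a = 1$, $b = 2$, $c = 8$ are arbitrary choices with $bc > 0$.

```python
import numpy as np

n, a, b, c = 5, 1.0, 2.0, 8.0
T = a * np.eye(n) + b * np.eye(n, k=1) + c * np.eye(n, k=-1)

e = np.sqrt(b / c)
D = np.diag(e ** np.arange(n))          # diag(1, e, ..., e^(n-1))
S = D @ T @ np.linalg.inv(D)            # off-diagonal entries become sqrt(bc)

eigs_T = np.linalg.eigvals(T)           # eigenvalues of the nonsymmetric T
eigs_S = np.linalg.eigvalsh(S)          # eigenvalues of the symmetric S, ascending
```

Here `S` is symmetric with both off-diagonals equal to $\sqrt{bc} = 4$, and the eigenvalues of `T` are real and coincide with those of `S` up to rounding.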




Theorem 5.6 If $T_n$ is a tridiagonal matrix defined as in (5.3) with $a, b, c \in \mathbb{R}$ and $bc > 0$, then the eigenvalues of $T_n$ are all real and have eigenspaces of dimension one.


Proof. The first half follows from the argument prior to the theorem. For the second part, it is sufficient to prove that each eigenvalue has only one eigenvector up to a factor.

Let $x = (x_1, \dots, x_n)^T$ be an eigenvector of $T_n$ corresponding to the eigenvalue $\lambda$. Then

$$(\lambda I - T_n)x = 0, \quad x \neq 0,$$

or equivalently

$$\begin{aligned} (\lambda - a)x_1 - b x_2 &= 0, \\ -c x_1 + (\lambda - a)x_2 - b x_3 &= 0, \\ &\ \ \vdots \\ -c x_{n-2} + (\lambda - a)x_{n-1} - b x_n &= 0, \\ -c x_{n-1} + (\lambda - a)x_n &= 0. \end{aligned}$$

Because $b \neq 0$, $x_2$ is determined by $x_1$ in the first equation, so are $x_3, \dots, x_n$ successively by $x_2$, $x_3$, and so on in the equations $2, 3, \dots, n - 1$. If $x_1$ is replaced by $k x_1$, then $x_2, x_3, \dots, x_n$ become $k x_2, k x_3, \dots, k x_n$, and the eigenvector is unique up to a factor.


Note that the theorem is in fact true for a general tridiagonal 
matrix when a; is real and b;c; > 0 for each 7. 


Problems 


1. Compute the determinant 


oo 8 
one 
a 


2. Carry out in detail the proof that $T_n$ is similar to a real symmetric matrix if $a, b, c \in \mathbb{R}$ and $bc > 0$.

3. Compute the $n \times n$ determinant


ewe 
i 


7. Find the inverse of the $n \times n$ matrix


a 2 0 
La 
a, 
= a 2 
zx 
t ae 
a! 
0 xz «(CG 


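9. (Cauchy matrix) Let $\lambda_1, \dots, \lambda_n$ be positive numbers and let

$$A = \left(\frac{1}{\lambda_i + \lambda_j}\right).$$

Show that

$$\det A = \frac{\prod_{i>j}(\lambda_i - \lambda_j)^2}{\prod_{i,j}(\lambda_i + \lambda_j)}.$$

[Hint: Subtract the last column from each of the other columns, then
factor; do the same thing for rows; use induction.]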

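10. Show that

$$\begin{vmatrix}
\frac{1}{2} & \frac{1}{3} & \cdots & \frac{1}{n+1}\\
\frac{1}{3} & \frac{1}{4} & \cdots & \frac{1}{n+2}\\
\vdots & \vdots & & \vdots\\
\frac{1}{n+1} & \frac{1}{n+2} & \cdots & \frac{1}{2n}
\end{vmatrix} > 0.$$

5.3 Circulant Matrices

An n-square circulant matrix is a matrix of the form

$$\begin{pmatrix}
c_0 & c_1 & c_2 & \cdots & c_{n-1}\\
c_{n-1} & c_0 & c_1 & \cdots & c_{n-2}\\
c_{n-2} & c_{n-1} & c_0 & \cdots & c_{n-3}\\
\vdots & \vdots & \vdots & & \vdots\\
c_1 & c_2 & c_3 & \cdots & c_0
\end{pmatrix}, \qquad (5.6)$$

where $c_0, c_1, \dots, c_{n-1}$ are complex numbers. For instance,

$$N = \begin{pmatrix}
1 & 2 & 3 & \cdots & n\\
n & 1 & 2 & \cdots & n-1\\
\vdots & \vdots & \vdots & & \vdots\\
3 & 4 & 5 & \cdots & 2\\
2 & 3 & 4 & \cdots & 1
\end{pmatrix}$$

and

$$P = \begin{pmatrix}
0 & 1 & 0 & \cdots & 0\\
0 & 0 & 1 & \cdots & 0\\
\vdots & \vdots & \vdots & & \vdots\\
0 & 0 & 0 & \cdots & 1\\
1 & 0 & 0 & \cdots & 0
\end{pmatrix} \qquad (5.7)$$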


are circulant matrices. Note that P is also a permutation matrix. 


We refer to this P as the n x n primary permutation matrix. 


This section deals with the basic properties of circulant matrices. 
The following theorem may be shown by a direct verification. 


Theorem 5.7 An n-square matrix C is circulant if and only if

$$C = PCP^T,$$

where P is the n × n primary permutation matrix.

We call a complex number $\omega$ an nth primitive root of unity if
$\omega^n - 1 = 0$ and $\omega^k - 1 \ne 0$ for every positive integer k < n.

Note that if $\omega$ is an nth primitive root of unity, then $\omega^k$ is a
solution to $x^n - 1 = 0$, 0 < k < n. It follows by factoring $x^n - 1$ that


k 


Theorem 5.8 Let C be a circulant matrix in the form (5.6), and let
$f(\lambda) = c_0 + c_1\lambda + \cdots + c_{n-1}\lambda^{n-1}$. Then

1. $C = f(P)$, where P is the n × n primary permutation matrix.

2. C is a normal matrix; that is, $C^*C = CC^*$.

3. The eigenvalues of C are $f(\omega^k)$, k = 0, 1, ..., n−1.

4. $\det C = f(\omega^0) f(\omega^1) \cdots f(\omega^{n-1})$.

5. $F^*CF$ is a diagonal matrix, where F is the unitary matrix with
the (i, j)-entry equal to $\frac{1}{\sqrt{n}}\,\omega^{(i-1)(j-1)}$, i, j = 1, ..., n.


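Proof. (1) is easy to see by a direct computation. (2) is due to the
fact that if matrices A and B commute, so do p(A) and q(B), where
p and q are any polynomials (Problem 4). Note that $PP^* = P^*P$.
For (3) and (4), the characteristic polynomial of P is

$$\det(\lambda I - P) = \lambda^n - 1 = \prod_{k=0}^{n-1}(\lambda - \omega^k).$$

Thus, the eigenvalues of P and $P^i$ are, respectively, $\omega^k$ and $\omega^{ik}$,
k = 0, 1, ..., n−1. It follows that the eigenvalues of $C = f(P)$ are
$f(\omega^k)$, k = 0, 1, ..., n−1 (Problem 7, Section 3.2), and that

$$\det C = \prod_{k=0}^{n-1} f(\omega^k).$$

To show (5), for each k = 0, 1, ..., n−1, let

$$x_k = (1, \omega^k, \omega^{2k}, \dots, \omega^{(n-1)k})^T.$$

Then

$$Px_k = (\omega^k, \omega^{2k}, \dots, \omega^{(n-1)k}, 1)^T = \omega^k x_k$$

and

$$Cx_k = f(P)x_k = f(\omega^k)x_k.$$

In other words, the $x_k$ are eigenvectors of P and C corresponding to
the eigenvalues $\omega^k$ and $f(\omega^k)$, respectively, k = 0, 1, ..., n−1.
However, because

$$(x_i, x_j) = \sum_{k=0}^{n-1} \overline{\omega^{jk}}\,\omega^{ik} = \sum_{k=0}^{n-1} \omega^{(i-j)k} =
\begin{cases} 0, & i \ne j,\\ n, & i = j,\end{cases}$$

we have that

$$\left\{ \tfrac{1}{\sqrt{n}}x_0,\ \tfrac{1}{\sqrt{n}}x_1,\ \dots,\ \tfrac{1}{\sqrt{n}}x_{n-1} \right\}$$

is an orthonormal basis for $\mathbb{C}^n$. Thus, we get a unitary matrix

$$F = \frac{1}{\sqrt{n}}\begin{pmatrix}
1 & 1 & 1 & \cdots & 1\\
1 & \omega & \omega^2 & \cdots & \omega^{n-1}\\
1 & \omega^2 & \omega^4 & \cdots & \omega^{2(n-1)}\\
\vdots & \vdots & \vdots & & \vdots\\
1 & \omega^{n-1} & \omega^{2(n-1)} & \cdots & \omega^{(n-1)(n-1)}
\end{pmatrix}$$

such that

$$F^*CF = \operatorname{diag}(f(\omega^0), f(\omega^1), \dots, f(\omega^{n-1})).$$

That F is a unitary matrix is verified by a direct computation. ∎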


Note that F, called a Fourier matrix, is independent of C.
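The eigenpairs in Theorem 5.8 can be checked numerically on a small example. This is a minimal sketch using floating-point complex arithmetic; the function names and the tolerance are illustrative choices, and the primitive root is taken to be $e^{2\pi i/n}$ as one possible choice.

```python
import cmath

def circulant(c):
    """Circulant matrix (5.6): row i, column j holds c[(j - i) mod n]."""
    n = len(c)
    return [[c[(j - i) % n] for j in range(n)] for i in range(n)]

def f(c, z):
    """f(z) = c_0 + c_1 z + ... + c_{n-1} z^{n-1}."""
    return sum(ck * z**k for k, ck in enumerate(c))

def check_eigenpairs(c, tol=1e-9):
    """Check C x_k = f(w^k) x_k for x_k = (1, w^k, ..., w^{(n-1)k})^T."""
    n = len(c)
    C = circulant(c)
    w = cmath.exp(2j * cmath.pi / n)   # one nth primitive root of unity
    for k in range(n):
        x = [w**(k * i) for i in range(n)]
        Cx = [sum(C[i][j] * x[j] for j in range(n)) for i in range(n)]
        lam = f(c, w**k)
        if any(abs(Cx[i] - lam * x[i]) > tol for i in range(n)):
            return False
    return True

print(check_eigenpairs([1, 2, 3, 4]))
```

The same vectors $x_k$ work for every circulant matrix of a given size, which reflects the remark that the Fourier matrix is independent of C.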


Problems 


1. Let w be an nth primitive root of unity. Show that 


(a) wo = 1. 
a 
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10. 


Lek: 


2. Let $\omega$ be an nth primitive root of unity. Show that $\omega^k$ is also an nth
primitive root of unity if and only if (n, k) = 1; that is, n and k have
no common positive divisors other than 1.

3. Show that if A is a circulant matrix, then so are $A^*$, $A^k$, and $A^{-1}$ if
the inverse exists.

4. Let A and B be square matrices of the same size. If AB = BA, show
that p(A)q(B) = q(B)p(A) for any polynomials p and q.

5. Let A and B be circulant matrices of the same size. Show that A
and B commute and that AB is a circulant matrix.

6. Let A be a circulant matrix. Show that for every positive integer k

$$\operatorname{rank}(A^k) = \operatorname{rank}(A).$$

7. Find the eigenvalues of the circulant matrices:


1 
2 and Ww Ww , w=, 
1 1 


8. Find the eigenvalues and the eigenvectors of the circulant matrix


0 1 0 0 


Find the matrix F’ that diagonalizes the above matrix. 


9. Let P be the n × n primary permutation matrix. Show that

$$P^n = I, \qquad P^T = P^{-1} = P^{n-1}.$$


Find a matrix X such that 


Co Cy C2 Co Cy C2 
Cy CQ Co =X C Co Cy 
c% C C1 C1 C2 


0 0 0 1 0 1 0 0 

»)| 1 0 0 0 _{ 0 0 1 0 
Q 0 1 0 0 oa 0 0 0 1 
0 0 1 0 1 0 0 O 
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13. 


14. 
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Let F be the n x n Fourier matrix. Show that 


(a) F is symmetric; namely, $F^T = F$.

(b) $(F^*)^2 = F^2$ is a permutation matrix.

(c) $(F^*)^3 = F$ and $F^4 = I$.

(d) The eigenvalues of F are ±1 and ±i with appropriate multiplicity
(which is the number of times the eigenvalue repeats).

(e) $F^* = n^{-1/2}\,V(1, \bar{\omega}, \bar{\omega}^2, \dots, \bar{\omega}^{n-1})$, where V stands for the
Vandermonde matrix (see the next section).

(f) If F = R + iS, where R and S are real, then $R^2 + S^2 = I$,
RS = SR, and R and S are symmetric.


13. Let $e_i$ be the column vectors of n components with the ith component
1 and 0 elsewhere, i = 1, 2, ..., n, let $c_1, c_2, \dots, c_n \in \mathbb{C}$, and let

$$A = \operatorname{diag}(c_1, c_2, \dots, c_n)P,$$

where P is the n × n primary permutation matrix. Show that

(a) $Ae_i = c_{i-1}e_{i-1}$ for each i = 1, 2, ..., n, where $c_0 = c_n$, $e_0 = e_n$.
(b) $A^n = cI$, where $c = c_1c_2\cdots c_n$.
(c) $\det(I + A + \cdots + A^{n-1}) = (1 - c)^{n-1}$.


14. A matrix A is called a Toeplitz matrix if all entries of A are constant
down the diagonals parallel to the main diagonal. In symbols,

$$A = \begin{pmatrix}
a_0 & a_1 & a_2 & \cdots & a_n\\
a_{-1} & a_0 & a_1 & \cdots & a_{n-1}\\
a_{-2} & a_{-1} & a_0 & \cdots & a_{n-2}\\
\vdots & \vdots & \ddots & \ddots & \vdots\\
a_{-n} & a_{-n+1} & \cdots & a_{-1} & a_0
\end{pmatrix}.$$

For example, matrix $F = (f_{ij})$ with $f_{i,i+1} = 1$, i = 1, 2, ..., n−1,
and 0 elsewhere, is a Toeplitz matrix. Show that (i) a matrix A is a
Toeplitz matrix if and only if A can be written in the form

$$A = \sum_{k=1}^{n} a_{-k}(F^T)^k + \sum_{k=0}^{n} a_k F^k,$$

(ii) the sum of two Toeplitz matrices is a Toeplitz matrix, (iii) a
circulant matrix is a Toeplitz matrix, and (iv) BA is a symmetric
matrix, known as a Hankel matrix, where B is the backward identity.


© 
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5.4 Vandermonde Matrices 


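An n-square Vandermonde matrix is a matrix of the form

$$\begin{pmatrix}
1 & 1 & 1 & \cdots & 1\\
a_1 & a_2 & a_3 & \cdots & a_n\\
a_1^2 & a_2^2 & a_3^2 & \cdots & a_n^2\\
\vdots & \vdots & \vdots & & \vdots\\
a_1^{n-1} & a_2^{n-1} & a_3^{n-1} & \cdots & a_n^{n-1}
\end{pmatrix},$$

denoted by $V_n(a_1, a_2, \dots, a_n)$ or simply V.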

Vandermonde matrices play a role in many places such as in- 
terpolation problems and solving systems of linear equations. We 
consider the determinant and the inverse of a Vandermonde matrix 
in this section. 


Theorem 5.9 Let $V_n(a_1, a_2, \dots, a_n)$ be a Vandermonde matrix. Then

$$\det V_n(a_1, a_2, \dots, a_n) = \prod_{1 \le i < j \le n}(a_j - a_i),$$

and $V_n(a_1, a_2, \dots, a_n)$ is invertible if and only if all the $a_i$ are distinct.

Proof. We proceed with the proof by induction. There is nothing to
show if n = 1 or 2. Let n ≥ 3.

Suppose the assertion is true when the size of the matrix is n−1.
For the case of n, subtracting row i multiplied by $a_1$ from row i+1,
for i going down from n−1 to 1, we have

$$\begin{aligned}
\det V &= \begin{vmatrix}
1 & 1 & 1 & \cdots & 1\\
0 & a_2 - a_1 & a_3 - a_1 & \cdots & a_n - a_1\\
0 & a_2(a_2 - a_1) & a_3(a_3 - a_1) & \cdots & a_n(a_n - a_1)\\
\vdots & \vdots & \vdots & & \vdots\\
0 & a_2^{n-2}(a_2 - a_1) & a_3^{n-2}(a_3 - a_1) & \cdots & a_n^{n-2}(a_n - a_1)
\end{vmatrix}\\[4pt]
&= \begin{vmatrix}
a_2 - a_1 & a_3 - a_1 & \cdots & a_n - a_1\\
a_2(a_2 - a_1) & a_3(a_3 - a_1) & \cdots & a_n(a_n - a_1)\\
\vdots & \vdots & & \vdots\\
a_2^{n-2}(a_2 - a_1) & a_3^{n-2}(a_3 - a_1) & \cdots & a_n^{n-2}(a_n - a_1)
\end{vmatrix}\\[4pt]
&= \prod_{j=2}^{n}(a_j - a_1)\,\det V_{n-1}(a_2, a_3, \dots, a_n)\\
&= \prod_{j=2}^{n}(a_j - a_1)\prod_{2 \le i < j \le n}(a_j - a_i) \quad \text{(by the hypothesis)}\\
&= \prod_{1 \le i < j \le n}(a_j - a_i).
\end{aligned}$$

It is readily seen that the Vandermonde matrix is singular if and only
if at least two of the $a_i$ are equal. ∎
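The product formula of Theorem 5.9 can be compared with a determinant computed by elimination. The sketch below is illustrative only: the helpers `vandermonde`, `det`, and `product_formula` are assumed names, and exact rational arithmetic is used so that the comparison is an equality rather than an approximation.

```python
from fractions import Fraction
from itertools import combinations

def vandermonde(a):
    """Rows are the powers 0, 1, ..., n-1 of the entries of a."""
    n = len(a)
    return [[Fraction(x)**i for x in a] for i in range(n)]

def det(M):
    """Determinant by Gaussian elimination with exact fractions."""
    M = [row[:] for row in M]
    n, d = len(M), Fraction(1)
    for col in range(n):
        piv = next((r for r in range(col, n) if M[r][col] != 0), None)
        if piv is None:
            return Fraction(0)
        if piv != col:                      # a row swap flips the sign
            M[col], M[piv] = M[piv], M[col]
            d = -d
        d *= M[col][col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n):
                M[r][k] -= factor * M[col][k]
    return d

def product_formula(a):
    """Product of (a_j - a_i) over all pairs i < j."""
    p = Fraction(1)
    for i, j in combinations(range(len(a)), 2):
        p *= Fraction(a[j]) - Fraction(a[i])
    return p

a = [2, 3, 5, 7]
print(det(vandermonde(a)) == product_formula(a))
```

With a repeated entry, both sides are zero, matching the singularity criterion in the theorem.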
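An interesting application follows: Let $A \in M_n$. Then

$$A^n = 0 \iff \operatorname{tr} A^k = 0, \quad k = 1, 2, \dots, n.$$

If $A^n = 0$, then A is nilpotent and thus has only zero eigenvalues;
so does $A^k$ for each k. For the other way around, let the eigenvalues
of A be $\lambda_1, \lambda_2, \dots, \lambda_n$. Then the trace identities imply

$$\begin{aligned}
\lambda_1 + \lambda_2 + \cdots + \lambda_n &= 0,\\
\lambda_1^2 + \lambda_2^2 + \cdots + \lambda_n^2 &= 0,\\
&\ \ \vdots\\
\lambda_1^n + \lambda_2^n + \cdots + \lambda_n^n &= 0,
\end{aligned}$$

rewritten as

$$V_n(\lambda_1, \lambda_2, \dots, \lambda_n)(\lambda_1, \lambda_2, \dots, \lambda_n)^T = 0.$$

If all of the $\lambda_i$ are distinct, then by the preceding theorem the Vandermonde
matrix is nonsingular and the system of equations in $\lambda_1, \lambda_2,
\dots, \lambda_n$ has only the trivial solution $\lambda_1 = \lambda_2 = \cdots = \lambda_n = 0$. If some
of the $\lambda_i$ are identical, for instance, $\lambda_1 = \lambda_2$ and $\lambda_2, \lambda_3, \dots, \lambda_n$ are
distinct, we then write the system as

$$V_{n-1}(\lambda_2, \dots, \lambda_n)(2\lambda_2, \lambda_3, \dots, \lambda_n)^T = 0.$$

A similar argument will result in $\lambda_2 = \cdots = \lambda_n = 0$.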
This idea applies to the interpolation problem of finding a polynomial
f(x) of degree at most n − 1 satisfying

$$f(x_i) = y_i, \quad i = 1, 2, \dots, n,$$

where the $x_i$ and $y_i$ are given constants (Problem 4).

Theorem 5.10 For any integers $k_1 < k_2 < \cdots < k_n$, the quotient

$$\frac{\det V_n(k_1, k_2, \dots, k_n)}{\det V_n(1, 2, \dots, n)}$$

is an integer.


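Proof. Let $f_i$ be any monic polynomial of degree i for i = 1, 2, ..., n−1.
The additive property of determinants (see Section 1.2) shows that

$$\begin{vmatrix}
1 & 1 & 1 & \cdots & 1\\
f_1(k_1) & f_1(k_2) & f_1(k_3) & \cdots & f_1(k_n)\\
f_2(k_1) & f_2(k_2) & f_2(k_3) & \cdots & f_2(k_n)\\
\vdots & \vdots & \vdots & & \vdots\\
f_{n-1}(k_1) & f_{n-1}(k_2) & f_{n-1}(k_3) & \cdots & f_{n-1}(k_n)
\end{vmatrix} \qquad (5.8)$$

is the same as $\det V_n(k_1, k_2, \dots, k_n)$. By taking, for any integer a,

$$f_i(a) = a(a-1)(a-2)\cdots(a-i+1) = i!\binom{a}{i},$$

we see that $f_i(a)$ is divisible by $i!$. Row i of (5.8) consists of values
of $f_{i-1}$, so every entry of row i is divisible by $(i-1)!$.

Factoring out $(i-1)!$ from row i, i = 2, 3, ..., n, we see that the
determinant in (5.8), thus $\det V_n(k_1, k_2, \dots, k_n)$, is divisible by the
product $\prod_{i=1}^{n}(i-1)!$.

The proof is complete, for $\prod_{i=1}^{n}(i-1)! = \det V_n(1, 2, \dots, n)$. ∎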
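The divisibility in Theorem 5.10 is easy to test on examples. The following sketch relies on the product formula of Theorem 5.9 rather than on an explicit determinant; the function names are illustrative, and `math.prod` requires Python 3.8 or later.

```python
from itertools import combinations
from math import factorial, prod

def vandermonde_det(ks):
    """det V_n(k_1, ..., k_n) via the product formula of Theorem 5.9."""
    return prod(ks[j] - ks[i] for i, j in combinations(range(len(ks)), 2))

def base_det(n):
    """det V_n(1, 2, ..., n), written as the product of (i-1)! for i = 1..n."""
    return prod(factorial(i - 1) for i in range(1, n + 1))

def quotient(ks):
    """Return (quotient, remainder) of det V_n(ks) by det V_n(1..n)."""
    return divmod(vandermonde_det(ks), base_det(len(ks)))

print(quotient([-3, 0, 4, 9]))
```

A zero remainder is what the theorem predicts for any increasing integer sequence.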

We now turn our attention to the inverse of a Vandermonde matrix.
Consider the polynomial in x given by the product

$$p(x) = (x + a_1)(x + a_2)\cdots(x + a_n),$$

where $a_1, a_2, \dots, a_n$ are constants. Expand p(x) as a polynomial

$$p(x) = s_0x^n + s_1x^{n-1} + \cdots + s_{n-1}x + s_n,$$

where $s_0 = 1$ and for each k = 1, 2, ..., n,

$$s_k = s_k(a_1, a_2, \dots, a_n) = \sum_{1 \le p_1 < \cdots < p_k \le n}\ \prod_{q=1}^{k} a_{p_q}.$$

We refer to $s_k$, depending on $a_1, a_2, \dots, a_n$, as the kth elementary
symmetric function of $a_1, a_2, \dots, a_n$. (See also Section 4.4.)
One may expand $p(x) = (x + a_1)(x + a_2)(x + a_3)$ as an example.

Theorem 5.11 Suppose that $a_1, a_2, \dots, a_n$ are distinct. Then

$$(V_n(a_1, a_2, \dots, a_n))^{-1} = (\alpha_{ij}),$$

where for each pair of i and j

$$\alpha_{ij} = (-1)^{1+j}\,
\frac{\displaystyle\sum_{1 \le p_1 < \cdots < p_{n-j} \le n,\ p_q \ne i}\ \prod_{q=1}^{n-j} a_{p_q}}
{\displaystyle\prod_{k=1,\,k \ne i}^{n}(a_k - a_i)}.$$


Proof. Recall from elementary linear algebra (Section 1.2) that the
entries of the inverse of the matrix V are the cofactors of order n − 1
divided by det V; that is,

$$V^{-1} = \left(\frac{1}{\det V}\,c_{ji}\right),$$

where $c_{ij}$ is the cofactor of the (i, j)-entry of V.

In what follows we compute the cofactors $c_{ij}$. Let $V_k$ be the
matrix obtained from V by deleting row k + 1 (the kth powers) and
adjoining as a new nth row the nth powers of the $a_i$. We show

$$\det V_k = s_{n-k}\det V. \qquad (5.9)$$

Augment V with the nth powers of the $a_i$ as the (n + 1)th row
and with $(1, -x, (-x)^2, \dots, (-x)^n)^T$ as the first column. Denote
the resulting matrix by W. Then W is a Vandermonde matrix and

$$\begin{aligned}
\det W &= (x + a_1)\cdots(x + a_n)\det V\\
&= (x^n + s_1x^{n-1} + \cdots + s_{n-1}x + s_n)\det V. \qquad (5.10)
\end{aligned}$$

Expanding det W along the first column, we have

$$\det W = \det V_0 + x\det V_1 + \cdots + x^n\det V. \qquad (5.11)$$

Identity (5.9) follows by comparing (5.10) and (5.11).


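Now notice that each cofactor $c_{ij}$ is a determinant of order n − 1 in
the same form as $\det V_k$. Let $V(\hat{a}_j)$ and $s_k(\hat{a}_j)$ denote, respectively,
the (n−1)-square Vandermonde matrix and the kth elementary symmetric
function of $a_1, a_2, \dots, a_n$ without $a_j$. Using (5.9) we have

$$\begin{aligned}
c_{ij} &= (-1)^{i+j}\det V_{i-1}(\hat{a}_j)\\
&= (-1)^{i+j}\,s_{(n-1)-(i-1)}(\hat{a}_j)\det V(\hat{a}_j)\\
&= (-1)^{i+j}\,s_{n-i}(\hat{a}_j)\det V(\hat{a}_j).
\end{aligned}$$

Thus,

$$\begin{aligned}
\alpha_{ji} &= \frac{1}{\prod_{t<s}(a_s - a_t)}\,(-1)^{i+j}\,s_{n-i}(\hat{a}_j)\det V(\hat{a}_j)\\
&= \frac{(-1)^{i+j}\,s_{n-i}(\hat{a}_j)}{\prod_{t<j}(a_j - a_t)\prod_{s>j}(a_s - a_j)}\\
&= \frac{(-1)^{i+1}\,s_{n-i}(\hat{a}_j)}{\prod_{k=1,\,k \ne j}^{n}(a_k - a_j)}\\
&= (-1)^{i+1}\,\frac{\sum_{p_q \ne j}\prod_{q=1}^{n-i} a_{p_q}}{\prod_{k=1,\,k \ne j}^{n}(a_k - a_j)},
\end{aligned}$$

or

$$\alpha_{ij} = (-1)^{1+j}\,\frac{\sum_{p_q \ne i}\prod_{q=1}^{n-j} a_{p_q}}{\prod_{k=1,\,k \ne i}^{n}(a_k - a_i)}. \ ∎$$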
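The closed form in Theorem 5.11 can be checked by multiplying it against V. This is a minimal sketch under the convention that row i of V holds the (i−1)th powers; indices in the code are 0-based, so the book's (i, j) corresponds to (i+1, j+1) below. All helper names are illustrative.

```python
from fractions import Fraction
from itertools import combinations
from math import prod

def elementary_symmetric(vals, k):
    """k-th elementary symmetric function of vals (equal to 1 when k = 0)."""
    return sum((prod(c) for c in combinations(vals, k)), Fraction(0))

def vandermonde(a):
    n = len(a)
    return [[Fraction(x)**i for x in a] for i in range(n)]

def vandermonde_inverse(a):
    """Entries alpha_ij from Theorem 5.11; requires distinct a_i."""
    a = [Fraction(x) for x in a]
    n = len(a)
    inv = []
    for i in range(n):
        others = a[:i] + a[i + 1:]                 # a without a_i
        denom = prod(ak - a[i] for ak in others)   # product of (a_k - a_i), k != i
        inv.append([(-1)**(1 + (j + 1)) * elementary_symmetric(others, n - (j + 1)) / denom
                    for j in range(n)])
    return inv

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

a = [1, 2, 4]
I3 = [[Fraction(int(i == j)) for j in range(3)] for i in range(3)]
print(matmul(vandermonde(a), vandermonde_inverse(a)) == I3)
```

Row i of the inverse lists the coefficients of the Lagrange basis polynomial that equals 1 at $a_i$ and 0 at the other nodes, which gives an independent way to sanity-check the signs.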


Problems 


1. Find the solution to the equation in a: 


2. Evaluate the determinant 
1 az a?+2? 
1 ay a?+y? 
1 az at4+2? 
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. Find all solutions to the system of equations in 71,...,%n, 


Lyte +Iin=a 
2 2 2 
Lite +e, =a 


apts +ar=a”. 


4. Let $x_1, x_2, \dots, x_n$ be different numbers. Show that for any set of n
numbers $y_1, y_2, \dots, y_n$, there exists a polynomial f(x) of degree at
most n − 1 such that $f(x_i) = y_i$, i = 1, 2, ..., n. In particular, for
any numbers $\lambda_1, \lambda_2, \dots, \lambda_n$, there exist polynomials g(x), and h(x)
if each $\lambda_i \ge 0$, of degree at most n − 1 such that


5. Let $A = \operatorname{diag}(\lambda_1, \lambda_2, \dots, \lambda_n)$, where $\lambda_i \ne \lambda_j$ for $i \ne j$. Show that for
every normal matrix $B \in M_n$ there exist a unitary matrix U and a
polynomial f such that $B = U^*f(A)U$.


. Let U = (ui;) be the p-square unitary matrix with 


Uig a j= 1,2,...,D, 


where p is a prime integer and w is a pth primitive root of unity. 
Show that all square submatrices of U are nonsingular. 


7. Let $S_1 = \{a_i\}_{i=1}^{r}$ and $S_2 = \{b_i\}_{i=1}^{s}$ be nonzero complex number
multisets (i.e., repetition of elements is allowed; say, {1, 2, 2, 3}). If

$$\sum_{i=1}^{r} a_i^k = \sum_{i=1}^{s} b_i^k \quad \text{for every positive integer } k \le r + s,$$

show that r = s and $S_1 = S_2$; that is, the two sets are the same.

8. Show that two n-square complex matrices A and B have the same
set of eigenvalues if and only if $\operatorname{tr} A^k = \operatorname{tr} B^k$, k = 1, 2, ..., n.


. Find the inverse, if it exists, of the Vandermonde matrix 


oe 
1 oav sy 
1 ge y? 


Expand and find elementary symmetric functions for 


p(x) = (a + ay) (a + ag)(x + ag)(x + a4). 
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11. 


12. 


13. 


14. 


15. 


16. 


11. Let W be the matrix obtained from $V = V_n(a_1, \dots, a_n)$ by replacing
the last row $(a_1^{n-1}, \dots, a_n^{n-1})$ with $(a_1^n, \dots, a_n^n)$. Show directly
without using (5.9) that $\det W = (a_1 + \cdots + a_n)\det V$.

12. Let $f_1(x), f_2(x), \dots, f_n(x)$ be polynomials with degree at most
n − 2. Show that for any numbers $a_1, a_2, \dots, a_n$

$$\begin{vmatrix}
f_1(a_1) & f_1(a_2) & \cdots & f_1(a_n)\\
f_2(a_1) & f_2(a_2) & \cdots & f_2(a_n)\\
\vdots & \vdots & & \vdots\\
f_n(a_1) & f_n(a_2) & \cdots & f_n(a_n)
\end{vmatrix} = 0.$$


Let A = (a;;) € M, and f(x) = ay + agv +--+ + a,;2"~" for 
i=1,2,...,n. Show that 


fi(ti) falte) +++ filen) 
os i 7 oe = A A Ul eae: 


: : : : ip7 a 
fn(t1)  fn(w2) +++ fn(@n) 
Let a1, 42,...,@,, be complex numbers. Show that 
dl 1 
ay An 
: : = (—1)""! det V;,(a1,..., Gn): 
ay an 
a2a3 eee An eee ay,ag a a An—-1 
15. Let $V = V_n(x_1, \dots, x_n)$ be the n × n Vandermonde matrix of $x_1, \dots,
x_n$, where $x_i \ne x_j$ whenever $i \ne j$. Define $F(x) = \prod_{j=1}^{n}(x - x_j)$
and $f_k(x) = F(x)/(x_k - x)$. Show that $f_k(x_j) = 0$ if $j \ne k$ and
$f_k(x_k) = -F'(x_k)$, where $F'$ is the first derivative of F. Expand
$-f_k(x)/F'(x_k)$ and form a matrix M by its coefficients as the kth
row of M, k = 1, ..., n. Show that M is the inverse of V.

16. Let A be an n × n matrix with eigenvalues $\lambda_1, \dots, \lambda_n$. Show that

$$\det(\lambda I - A) = \lambda^n - \sigma_1\lambda^{n-1} + \sigma_2\lambda^{n-2} - \cdots + (-1)^n\sigma_n,$$

where $\sigma_k = s_k(\lambda_1, \lambda_2, \dots, \lambda_n)$, k = 1, 2, ..., n. Describe $\sigma_k$ in terms
of principal minors (see Problem 19, Section 1.3).


© 
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5.5 Hadamard Matrices 


An n-square matrix A is called a Hadamard matrix if each entry of 
A is 1 or —1 and if the rows or columns of A are orthogonal; that is, 


$$AA^T = nI \quad \text{or} \quad A^TA = nI.$$

Note that $AA^T = nI$ and $A^TA = nI$ are equivalent (Problem 5).
The following are two examples of Hadamard matrices: 


1 1 1 1 

1 1 1 1 -1 -1l 

( _— ) , a en Co 

1 -l 1 -l 
Notice that if A is a Hadamard matrix, then so is AP for any 
matrix P with entries +1 satisfying PP? = I. Thus, one may change 
the —1 in the first row of A to +1 by multiplying an appropriate 


matrix P with diagonal entries +1. There is only one 2x 2 Hadamard 
matrix of this kind. Can one construct a 3 x 3 Hadamard matrix? 


Theorem 5.12 Let n > 2. A necessary condition for an n-square
matrix A to be a Hadamard matrix is that n is a multiple of 4.

Proof 1. Let $A = (a_{ij})$ be an n-square Hadamard matrix. Noticing
that the entries of A are ±1, the equation $AA^T = nI$ yields

$$\sum_{k=1}^{n} a_{ik}a_{jk} = \begin{cases} 0, & \text{if } i \ne j,\\ n, & \text{if } i = j.\end{cases}$$

Upon computation, we have

$$\begin{aligned}
\sum_{k=1}^{n}(a_{1k} + a_{2k})(a_{1k} + a_{3k})
&= \sum_{k=1}^{n} a_{1k}^2 + \sum_{k=1}^{n} a_{1k}a_{2k} + \sum_{k=1}^{n} a_{1k}a_{3k} + \sum_{k=1}^{n} a_{2k}a_{3k}\\
&= \sum_{k=1}^{n} a_{1k}^2\\
&= n.
\end{aligned}$$

Observe that the possible values for $a_{1k} + a_{2k}$ and $a_{1k} + a_{3k}$ are +2, 0,
and −2. Thus, each term in the summation

$$\sum_{k=1}^{n}(a_{1k} + a_{2k})(a_{1k} + a_{3k})$$

must be +4, 0, or −4. It follows that n is divisible by 4.


Proof 2. Let P be an n-square matrix with main diagonal entries
1 or −1 such that the first row of AP consists entirely of +1. Note
that AP is also a Hadamard matrix. Since the second and third rows
of AP are orthogonal to the first row, they must each have the same
number, say r, of +1s and −1s. Thus n = 2r is an even number.
Let $n_{+}^{-}$ be the number of columns of AP that contain a +1 of
row 2 and a −1 of row 3. Similarly, define $n_{-}^{+}$, $n_{+}^{+}$, and $n_{-}^{-}$. Then

$$n_{+}^{+} + n_{+}^{-} = n_{+}^{+} + n_{-}^{+} = n_{-}^{+} + n_{-}^{-} = n_{+}^{-} + n_{-}^{-} = r.$$

Thus,

$$n_{+}^{+} = n_{-}^{-}, \qquad n_{+}^{-} = n_{-}^{+}.$$

The orthogonality of rows 2 and 3 implies that

$$n_{+}^{+} + n_{-}^{-} = n_{+}^{-} + n_{-}^{+}.$$

This gives $n_{+}^{+} = n_{+}^{-}$. Therefore, $n = 2r = 4n_{+}^{+}$ is a multiple of 4. ∎


It has been conjectured that a Hadamard matrix of size 4k x 4k 
exists for every positive integer k. The conjecture is not resolved yet. 

The following theorem, verified by a direct computation, gives a 
way to construct Hadamard matrices of larger dimensions. 


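Theorem 5.13 If A is a Hadamard matrix, then so is

$$\begin{pmatrix} A & A\\ A & -A \end{pmatrix}. \qquad (5.12)$$

By this theorem, Hadamard matrices $H_n$ of order $2^n$ can be generated
recursively by defining

$$H_1 = \begin{pmatrix} 1 & 1\\ 1 & -1 \end{pmatrix}, \qquad
H_n = \begin{pmatrix} H_{n-1} & H_{n-1}\\ H_{n-1} & -H_{n-1} \end{pmatrix}, \quad n \ge 2. \qquad (5.13)$$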
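The recursion (5.13) translates directly into code. The sketch below builds $H_n$ with plain nested lists and checks the defining property $AA^T = mI$; the function names `sylvester` and `is_hadamard` are illustrative and not from the text.

```python
def sylvester(n):
    """H_n of order 2**n from the recursion (5.13), for n >= 1."""
    H = [[1, 1], [1, -1]]
    for _ in range(n - 1):
        top = [row + row for row in H]                    # (H  H)
        bottom = [row + [-x for x in row] for row in H]   # (H -H)
        H = top + bottom
    return H

def is_hadamard(A):
    """Entries are +1 or -1 and the rows are pairwise orthogonal."""
    m = len(A)
    if any(len(row) != m or any(x not in (1, -1) for x in row) for row in A):
        return False
    return all(sum(A[i][k] * A[j][k] for k in range(m)) == (m if i == j else 0)
               for i in range(m) for j in range(m))

H3 = sylvester(3)
print(len(H3), is_hadamard(H3))
```

Each step doubles the order, so this construction only reaches orders that are powers of 2.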
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Let 


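$$x_1 = \begin{pmatrix} 1\\ c \end{pmatrix} \quad \text{and} \quad
x_n = \begin{pmatrix} x_{n-1}\\ c\,x_{n-1} \end{pmatrix}, \quad \text{where } c = -1 + \sqrt{2}.$$

By a simple computation, $H_1$ has two eigenvalues $\pm\sqrt{2}$, and $x_1$ is an
eigenvector corresponding to $\sqrt{2}$. This generalizes as follows.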


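Theorem 5.14 Let $H_n$ be defined as in (5.13). Then $H_n$ has eigenvalues
$+2^{n/2}$ and $-2^{n/2}$, each of multiplicity $2^{n-1}$, and an eigenvector
$x_n$ corresponding to the positive eigenvalue $2^{n/2}$.

Proof. The proof is done by induction on n. The case of n = 1 was
discussed just prior to the theorem. Now for n ≥ 2, we have

$$\begin{aligned}
\det(\lambda I - H_n) &= \begin{vmatrix} \lambda I - H_{n-1} & -H_{n-1}\\ -H_{n-1} & \lambda I + H_{n-1} \end{vmatrix}\\
&= \det\big((\lambda I - H_{n-1})(\lambda I + H_{n-1}) - H_{n-1}^2\big)\\
&= \det(\lambda^2 I - 2H_{n-1}^2)\\
&= \det(\lambda I - \sqrt{2}H_{n-1})\det(\lambda I + \sqrt{2}H_{n-1}).
\end{aligned}$$

Thus each eigenvalue $\mu$ of $H_{n-1}$ generates two eigenvalues $\pm\sqrt{2}\mu$ of
$H_n$. The assertion then follows by the induction hypothesis, for $H_{n-1}$
has eigenvalues $+2^{(n-1)/2}$ and $-2^{(n-1)/2}$, each of multiplicity $2^{n-2}$.
To see the eigenvector part, we observe that, by induction again,

$$\begin{aligned}
H_nx_n &= \begin{pmatrix} H_{n-1} & H_{n-1}\\ H_{n-1} & -H_{n-1} \end{pmatrix}
\begin{pmatrix} x_{n-1}\\ (-1 + \sqrt{2})x_{n-1} \end{pmatrix}\\
&= \begin{pmatrix} \sqrt{2}H_{n-1}x_{n-1}\\ (2 - \sqrt{2})H_{n-1}x_{n-1} \end{pmatrix}\\
&= 2^{n/2}\begin{pmatrix} x_{n-1}\\ (-1 + \sqrt{2})x_{n-1} \end{pmatrix}\\
&= 2^{n/2}x_n. \ ∎
\end{aligned}$$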
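The eigenvector recursion can be verified numerically for small n. This is a floating-point sketch, assuming $x_1 = (1, c)^T$ and $x_n = (x_{n-1}, c\,x_{n-1})^T$ with $c = \sqrt{2} - 1$ as above; the helper names and tolerances are illustrative.

```python
from math import sqrt, isclose

def hadamard(n):
    """H_n from the recursion (5.13)."""
    H = [[1, 1], [1, -1]]
    for _ in range(n - 1):
        H = [row + row for row in H] + [row + [-x for x in row] for row in H]
    return H

def eigenvector(n):
    """x_n built by stacking x_{n-1} on top of c * x_{n-1}."""
    c = sqrt(2) - 1
    x = [1.0, c]
    for _ in range(n - 1):
        x = x + [c * t for t in x]
    return x

def check(n):
    """Test H_n x_n = 2^{n/2} x_n componentwise, up to rounding."""
    H, x, lam = hadamard(n), eigenvector(n), 2 ** (n / 2)
    return all(isclose(sum(h * t for h, t in zip(row, x)), lam * xi,
                       rel_tol=1e-9, abs_tol=1e-9)
               for row, xi in zip(H, x))

print(check(1), check(4))
```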
Let $J_n$ denote the n-square matrix whose entries are all equal to
1. We give a lower bound for the size of a Hadamard matrix that
contains a $J_n$ as a submatrix.

Theorem 5.15 If A is an m-square Hadamard matrix that contains
a $J_n$ as a submatrix, then $m \ge n^2$.

Proof. We may assume by permutation that A is partitioned as

$$A = \begin{pmatrix} J_n & X\\ Y & Z \end{pmatrix}, \qquad (5.14)$$

where Z is an s-square matrix of entries ±1, and s = m − n.
Since A is a Hadamard matrix of size m = n + s, we have

$$AA^T = (n + s)I_m,$$

which implies, by using the block form (5.14) of A, that

$$nJ_n + XX^T = (n + s)I_n.$$

Thus,

$$XX^T = (n + s)I_n - nJ_n. \qquad (5.15)$$

The eigenvalues of the right-hand matrix in (5.15) are

$$n + s - n^2,\ n + s,\ \dots,\ n + s.$$

However, $XX^T$ is positive semidefinite, and thus has nonnegative
eigenvalues. Therefore, $n + s - n^2 \ge 0$ or $m \ge n^2$. ∎


Problems 


1. Show that A is a Hadamard matrix, and then find A*, where 
1 1 -1 
a 1 -1 1 
o f=) G@ 4 
-1 1 1 1 
2. Does there exist a 3 x 3 Hadamard matrix? How about a 6 x 6? 
Construct an 8 x 8 Hadamard matrix. 


3. What is the determinant of an n-square Hadamard matrix? 


4. Let A = (aj;) be a 3 x 3 matrix. Show that if each a;; = 1 or —1, 
then det A is an even number. What is the maximum for det A? 
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. Let A be an n-square matrix with entries +1. Show that 


$$AA^T = nI \iff A^TA = nI.$$

Conclude that if A is Hadamard, then $\frac{1}{\sqrt{n}}A$ is orthogonal.


. Find the eigenvalues and eigenvectors of the Hadamard matrix 


1 1 1 1 
1. “= 1 <1 
1 1 -1 -1 
1 = 1 


. Show that the Kronecker product of two Hadamard matrices is also 


a Hadamard matrix. 


. Let n > 2 and define recursively, as in (5.13), 


= 1 1 _ Ay -4 Ay, 4 
m=(4 As ce a ) 


1 1 
Fy =5(1+ 2-7/7), Fo = =a0= Q-"/? H,,). 


and let 


Show that F, and F> are idempotent matrices and that 


Fi+h=2-™?H,. 


. Let n > 2 and define recursively 


0 -1 0 -E, 
m=(9 he Fou ( 3, 0 ): 
Show that 
(a) E2 = (-1)"Ion. 
(b) E,, is symmetric if n is even, and skew-symmetric if n is odd. 
(c) FE, H, = (—1)"H,,E,, where H,, is defined as in (5.13). 


10. Let A be a square matrix with entries 1, −1, or 0. If each row and
column of A contains only one nonzero entry 1 or −1, show that
$A^k = I$ for some positive integer k.

11. How many n × n matrices of 0 and 1 entries are there for which the
number of 1s in each row and column is even? [Answer: $2^{(n-1)(n-1)}$.]


© 
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5.6 Permutation and Doubly Stochastic Matrices 


A square matrix is called a permutation matrix if each row and col- 
umn of the matrix has exactly one 1 and all other entries are 0. 

It is easy to see that there are n! permutation matrices of size n.
Furthermore, the product of two permutation matrices of the same
size is a permutation matrix, and if P is a permutation matrix, then
P is invertible, and $P^{-1} = P^T$ (Problem 1).

Our goal in this section is to show that every permutation matrix 
is a direct sum of primary permutation matrices under permutation 
similarity and that every doubly stochastic matrix is a convex com- 
bination of permutation matrices. 

A matrix A of order n is said to be reducible if there exists a
permutation matrix P such that

$$P^TAP = \begin{pmatrix} B & C\\ 0 & D \end{pmatrix}, \qquad (5.16)$$

where B and D are square matrices of order at least 1.

A matrix is said to be irreducible if it is not reducible. Note
that a matrix of order 1 is considered to be irreducible. The matrix
$P^TAP = P^{-1}AP$ in (5.16) is similar to A through the permutation
matrix P. We say that they are permutation similar.

It is obvious that the diagonal entries of irreducible permutation 
matrices are all equal to 0, but not vice versa. For example, 


0 1 0 0 
1 0 0 0 
00 0 1 
0 0 1 0 


Theorem 5.16 Every reducible permutation matrix is permutation
similar to a direct sum of irreducible permutation matrices.

Proof. Let A be an n-square reducible permutation matrix, as in
(5.16). The matrix C in this case must be zero, for otherwise, let B
be r × r and D be s × s, where r + s = n. Then B contains r 1s
(in columns) and D contains s 1s (in rows). If C contained a 1, then
A would have at least r + s + 1 = n + 1 1s, a contradiction. The
assertion then follows by the induction on B and D. ∎

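We now show that every n-square irreducible permutation matrix
is permutation similar to the n × n primary permutation matrix

$$P = \begin{pmatrix}
0 & 1 & 0 & \cdots & 0\\
0 & 0 & 1 & \cdots & 0\\
\vdots & \vdots & \vdots & & \vdots\\
0 & 0 & 0 & \cdots & 1\\
1 & 0 & 0 & \cdots & 0
\end{pmatrix}. \qquad (5.17)$$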

Theorem 5.17 A primary permutation matrix is irreducible.

Proof. Suppose the n × n primary permutation matrix P is reducible.
Let $S^TPS = J_1 \oplus \cdots \oplus J_k$, k ≥ 2, where S is some permutation matrix
and the $J_i$ are irreducible matrices with order < n.

The rank of P − I is n − 1, for det(P − I) = 0 and the submatrix
of size n − 1 by deleting the last row and the last column from P − I
is nonsingular. It follows that

$$\operatorname{rank}(S^TPS - I) = \operatorname{rank}(S^T(P - I)S) = n - 1.$$

By using the above decomposition, we have

$$\operatorname{rank}(S^TPS - I) = \sum_{i=1}^{k}\operatorname{rank}(J_i - I) \le n - k < n - 1.$$

This is a contradiction. The proof is complete. ∎
The eigenvalues of the n × n primary permutation matrix P are
exactly all the roots of the equation $\lambda^n = 1$; that is, $1, \omega, \dots, \omega^{n-1}$,
where $\omega$ is an nth primitive root of unity, because

$$\det(\lambda I - P) = \lambda^n - 1$$

by a direct computation. In addition, for any positive integer k < n,

$$P^n = I, \qquad P^{n-1} = P^T, \qquad P^k = \begin{pmatrix} 0 & I_{n-k}\\ I_k & 0 \end{pmatrix}.$$

Theorem 5.18 A permutation matrix is irreducible if and only if it
is permutation similar to a primary permutation matrix.


Proof. Let Q be an n × n permutation matrix and P the n × n
primary permutation matrix in (5.17). If Q is permutation similar
to P, then Q is irreducible by the previous theorem. Conversely,
suppose that Q is irreducible. We show that Q can be brought to P
through simultaneous row and column permutations.

Let the 1 of the first row be in the position $(1, i_1)$. Then $i_1 \ne 1$
since Q is irreducible. If $i_1 = 2$, we proceed to the next step,
considering the 1 in the second row. Otherwise, $i_1 > 2$. Permute
columns 2 and $i_1$ so that the 1 is placed in the (1, 2) position.

Permute rows 2 and $i_1$ to get a matrix $Q_1$. This matrix is permutation
similar to Q and also irreducible. If the (2, 3)-entry of $Q_1$
is 1, we go on to the next step. Otherwise, let the $(2, i_2)$-entry be 1,
$i_2 \ne 3$. If $i_2 = 1$, then $Q_1$ would be reducible, for all entries in the
first two columns but not in the first two rows equal 0. Thus, $i_2 > 3$.
Permute columns 3 and $i_2$ so that the 1 is in the (2, 3) position, and
permute rows 3 and $i_2$ accordingly.

Note that the 1 in the (1, 2) position was not affected by the
permutations in the second step. Continuing in this way, one obtains
the permutation matrix P in the form of (5.17). The product of
a sequence of permutation matrices is also a permutation matrix,
therefore we have a permutation matrix S such that

$$S^TQS = S^{-1}QS = P. \ ∎$$

Combining the above theorems, we see that every reducible permutation
matrix is permutation similar to a direct sum of primary
permutation matrices. Moreover, the rank of an n-square irreducible
permutation matrix minus I is n − 1 (Problem 4).
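The reduction to a direct sum of primary permutation matrices can be sketched through cycles. The code below is not the row-and-column swapping procedure of the proof; it uses the standard equivalent view, assumed here, that a permutation matrix is irreducible exactly when its permutation is a single n-cycle. All function names are illustrative.

```python
def cycles(P):
    """Cycle decomposition of the permutation i -> (column of the 1 in row i)."""
    n = len(P)
    sigma = [row.index(1) for row in P]
    seen, result = [False] * n, []
    for start in range(n):
        if not seen[start]:
            cyc, i = [], start
            while not seen[i]:
                seen[i] = True
                cyc.append(i)
                i = sigma[i]
            result.append(cyc)
    return result

def is_irreducible(P):
    return len(cycles(P)) == 1

def conjugate(P, order):
    """S^T P S, where column k of S is the standard basis vector e_{order[k]}."""
    n = len(P)
    return [[P[order[k]][order[l]] for l in range(n)] for k in range(n)]

def primary(n):
    """The n x n primary permutation matrix (5.17)."""
    return [[1 if j == (i + 1) % n else 0 for j in range(n)] for i in range(n)]

Q = [[0, 0, 1], [1, 0, 0], [0, 1, 0]]
order = [i for c in cycles(Q) for i in c]
print(is_irreducible(Q), conjugate(Q, order) == primary(3))
```

Listing the indices cycle by cycle gives a permutation S for which $S^TQS$ is block diagonal, with one primary permutation block per cycle.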


Theorem 5.19 Let Q be an n-square permutation matrix. Then Q
is irreducible if and only if the eigenvalues of Q are $1, \omega, \dots, \omega^{n-1}$,
where $\omega$ is an nth primitive root of unity.

Proof. If Q is irreducible, then Q is similar to the n × n primary
permutation matrix, according to Theorem 5.18, which has the eigenvalues
$1, \omega, \dots, \omega^{n-1}$; so does matrix Q.

Conversely, suppose that $1, \omega, \dots, \omega^{n-1}$ are the eigenvalues of
Q. Note that $\omega^k \ne 1$ for any 1 ≤ k < n since $\omega$ is an nth primitive
root of unity. If Q is reducible, then we may write

$$S^TQS = J_1 \oplus \cdots \oplus J_k,$$

where S is a permutation matrix, and the $J_i$ are primary permutation
matrices with order less than n.

The eigenvalues of those $J_i$ are the eigenvalues of Q, none of
which is an nth primitive root of unity, for the order of every $J_i$ is
less than n. This is a contradiction. Thus, Q is irreducible. ∎


We next present a beautiful relation between permutation matri- 
ces and doubly stochastic matrices, a type of matrices that plays an 
important role in statistics and in some other subjects. 

A square matrix is said to be doubly stochastic if all entries of the
matrix are nonnegative and the sum of the entries in each row and
each column equals 1. Equivalently, a matrix A with nonnegative
entries is doubly stochastic if

$$e^TA = e^T \quad \text{and} \quad Ae = e, \quad \text{where } e = (1, 1, \dots, 1)^T. \qquad (5.18)$$

It is readily seen that permutation matrices are doubly stochastic
and so is the product of two doubly stochastic matrices.

We show that a matrix is a doubly stochastic matrix if and only
if it is a convex combination of finitely many permutation matrices. To
prove this, we need a result, which is of interest in its own right.


Theorem 5.20 (Frobenius–König) Let A be an n-square complex
matrix. Then every product of n entries of A taken from distinct rows
and columns equals 0; in symbols,

$$a_{1i_1}a_{2i_2}\cdots a_{ni_n} = 0, \quad \{i_1, i_2, \dots, i_n\} = \{1, 2, \dots, n\}, \qquad (5.19)$$

if and only if A contains an r × s zero submatrix, where r + s = n + 1.

Proof. First notice that property (5.19) of A will remain true when
row or column permutations are applied to A. In other words, an
n-square matrix A has property (5.19) if and only if PAQ has the
property, where P and Q are any n-square permutation matrices.

Sufficiency: We may assume by permutation that the r × s zero
submatrix is in the lower-left corner, and write

$$A = \begin{pmatrix} B & C\\ 0 & D \end{pmatrix}.$$

Because n − r = s − 1, B is of size (s − 1) × s. Thus, there must
be a zero among any s entries taken from the first s columns and
any s different rows. Therefore, every product $a_{1i_1}a_{2i_2}\cdots a_{ni_n}$ has to
contain a zero factor, hence equals zero.

Necessity: If all the entries of A are zero, there is nothing to prove.
Suppose A has a nonzero entry and consider the submatrix obtained
from A by deleting the row and the column that contain the nonzero
entry. An application of induction on the (n−1) × (n−1) submatrix
results in a zero submatrix of size p × q, where p + q = (n−1) + 1 = n.
We thus may write A, by permutation, as

$$A = \begin{pmatrix} B & C\\ 0 & D \end{pmatrix},$$

where B is q × q and D is p × p. Since every product of the entries
of A from different rows and columns is 0, this property must be
inherited by B or D, say B. Applying the induction to B, we see
that B has an l × s zero submatrix such that l + s = q + 1. Putting
this zero submatrix in the lower-left corner of B, we see that A has
an r × s zero submatrix, where r = p + l and r + s = n + 1. ∎


Theorem 5.21 (Birkhoff) A matrix A is doubly stochastic if and
only if it is a convex combination of permutation matrices.

Proof. To show sufficiency, let A be a convex combination of permutation
matrices $P_1, P_2, \dots, P_m$; that is,

$$A = t_1P_1 + t_2P_2 + \cdots + t_mP_m,$$

where $t_1, t_2, \dots, t_m$ are nonnegative numbers of a sum equal to 1.
Then it is easy to see that $e^TA = e^T$ and $Ae = e$, where $e =
(1, \dots, 1)^T$. By (5.18) A is doubly stochastic.

For necessity, we apply induction on the number of zero entries of
the doubly stochastic matrices. An n-square doubly stochastic matrix
has at most $n^2 - n$ zeros, and if A has exactly $n^2 - n$ zeros, then
A is a permutation matrix, and we have nothing to show. Suppose
that the doubly stochastic matrices with at least k zeros are convex
combinations of permutation matrices. We show that the assertion
holds for the doubly stochastic matrices with k − 1 zeros.

Let A be an n-square doubly stochastic matrix of k − 1 zero
entries. If every product of the entries of A from distinct rows and
columns is zero, then A may be written as, up to permutation,

$$A = \begin{pmatrix} B & C\\ 0 & D \end{pmatrix},$$

where the zero submatrix is of size r × s with r + s = n + 1.

Since the entries in each column of A add up to 1, the sum of all
entries of B equals s. Similarly, by considering rows, the sum of all
entries of D is r. Thus, the sum of all entries of A would be at least
r + s = n + 1. This is impossible, for the sum of all entries of A is n.
Therefore, some product $a_{1i_1}a_{2i_2}\cdots a_{ni_n} \ne 0$.

Let $P_1$ be a permutation matrix with 1 in the positions $(j, i_j)$,
j = 1, 2, ..., n, and 0 elsewhere, and let
$\delta = \min\{a_{1i_1}, a_{2i_2}, \dots, a_{ni_n}\}$. If $\delta = 1$, then $A = P_1$ and we are
done. Otherwise, consider the matrix

$$E = (1 - \delta)^{-1}(A - \delta P_1).$$

It is readily seen by (5.18) that E is also a doubly stochastic
matrix and that E has at least one more zero than A. By the induction
hypothesis, there are positive numbers $t_2, \dots, t_m$ of sum 1, and
permutation matrices $P_2, \dots, P_m$, such that

$$E = t_2P_2 + \cdots + t_mP_m.$$

It follows that

$$A = \delta P_1 + (1 - \delta)t_2P_2 + \cdots + (1 - \delta)t_mP_m,$$

where the $P_i$ are permutation matrices, and their coefficients are
nonnegative and sum up to 1. ∎
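The necessity argument is constructive and can be turned into a small procedure. The sketch below is a variant of the proof, not a transcription: instead of rescaling by $1/(1 - \delta)$ at each step, it keeps subtracting $\delta P$ from the remainder and records the coefficients. The permutation with nonzero entries, whose existence the proof derives from Theorem 5.20, is found here by a standard augmenting-path bipartite matching, which is an implementation choice not in the text. Function names are illustrative.

```python
from fractions import Fraction

def positive_diagonal(A):
    """Find sigma with A[i][sigma[i]] > 0 for every row i, or None if none exists."""
    n = len(A)
    match_col = [-1] * n          # match_col[j] = row currently assigned to column j

    def try_row(i, visited):
        for j in range(n):
            if A[i][j] > 0 and not visited[j]:
                visited[j] = True
                if match_col[j] == -1 or try_row(match_col[j], visited):
                    match_col[j] = i
                    return True
        return False

    for i in range(n):
        if not try_row(i, [False] * n):
            return None
    sigma = [0] * n
    for j, i in enumerate(match_col):
        sigma[i] = j
    return sigma

def birkhoff(A):
    """Return [(t, sigma), ...] with A equal to the sum of t * P_sigma."""
    A = [[Fraction(x) for x in row] for row in A]
    n, terms = len(A), []
    while any(x != 0 for row in A for x in row):
        sigma = positive_diagonal(A)
        if sigma is None:
            raise ValueError("input is not doubly stochastic")
        delta = min(A[i][sigma[i]] for i in range(n))
        terms.append((delta, sigma))
        for i in range(n):
            A[i][sigma[i]] -= delta   # creates at least one new zero entry
    return terms

A = [[Fraction(1, 2), Fraction(1, 2), 0],
     [Fraction(1, 4), Fraction(1, 4), Fraction(1, 2)],
     [Fraction(1, 4), Fraction(1, 4), Fraction(1, 2)]]
for t, sigma in birkhoff(A):
    print(t, sigma)
```

Each pass zeroes at least one entry, so the loop ends after at most as many passes as A has nonzero entries, and the recorded coefficients add up to 1 because the common row sum of the remainder drops from 1 to 0.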
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Problems 


1. Show that the determinant of a permutation matrix is ±1 and that
permutation matrices are unitary and hence normal.


Find permutations that bring the reducible permutation matrix 


to a direct sum of irreducible matrices. Show that P? = I. 


Let A= B= C als Show that A and B are both irreducible but 


AB, A?, and A® B are reducible. 


Let P be an n x n irreducible permutation matrix. Show that 
rank (P—JI)=n-1. 


Let P be an n X n irreducible permutation matrix. Show that P* is 
irreducible if and only if (n,k) = 1; that is, n and k have no common 
positive divisors other than 1. 


Let P be an n-square permutation matrix. Show that P is irreducible 
if and only if P has the property P™ =I, = n|m. 


Let P be an n X n irreducible permutation matrix. Show that P is 
diagonalizable over C but not over R when n > 3. 


Show that if two permutation matrices A and B are similar, i.e., 
S-!AS = B for some nonsingular S$, then they are permutation 
similar; that is, S can be chosen to be a permutation matrix. 


Show that for any n x n permutation matrix P, P”' = I, and further 
that if P is irreducible, then P” = I. Is the converse true? 


Prove or disprove that a symmetric permutation matrix (of odd or 
even order greater than 1) is reducible. 


Find the rank of the partitioned permutation matrix Q, where 


001 0 0 
0G OF 0 
@=|0 0007 
I 000 0 
0100 0 
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12. 


13. 


14. 


15. 


16. 


17. 


18. 


19. 
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Show that 
011i 
0 01 1 
0 0 0 1 
0 0 0 0 


is not permutation similar to its Jordan canonical form 


Show that an n-square matrix A with nonnegative entries is reducible 
if and only if there exists a proper subset {71,...,7,} of {1,...,n} 
such that 


Span{Ae;,,..., Ae:, } C Span{e;,,...,e:, }, 


where the e; are the standard basis vectors for C”. 


Show that the product of two doubly stochastic matrices is a doubly 
stochastic matrix. How about the sum? 


Show that if U = (u,;) is a unitary matrix, then A = (|u,;|?) is a 
doubly stochastic matrix. How about B = (|u,;||vi;|), where V = 
(v;;) is a unitary matrix of the same size as U? 


Let P and Q be permutation matrices of the same size. Show that 
aP +(1—a)Q 


is a doubly stochastic matrix for any a € [0, 1]. 


Show that the following determinant is zero: 


~OQ 7a ea 
SruNac 
sr OO0°0 
ov coo 
RBrOoOoO 


Show that every n x n doubly stochastic matrix is a convex combi- 
nation of at most n? — 2n +2 permutation matrices. 


If A is an n X n nonsingular matrix, how many zero entries can A 
have at most? 
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20. 


21. 


22. 


23. 


Show that every nilpotent matrix with nonnegative entries is per- 
mutation similar to a strictly upper-triangular matrix. [Hint: Let 
€1,..-,€n be the standard basis vectors of C”. Show Ae; = 0 for 
some i; that is, A contains a zero column, by induction] 


A permutation of an n-element set {1,2,...,n} is a mapping 


p:1 3, 24 tg, ..., NA Ip, 


_f{ 1 2 «on 
Oe ae te Hy 


Assign p a permutation matrix P = f(p), which has 1 in the (k, i,) 
position and 0 elsewhere, & = 1,2,...,n. Define the product of 


written as 


ft @ 25. # and aa (2 #2 0 in 
aa ae eros a o4 35. He By 
by 
( L 2 ) 
ve Ji J2 In 
Show that 


f (pa) = f(p) f(q)- 


A permutation is called an interchange if only two elements are per- 
muted. For instance, the following p is an interchange, where 


Show that every permutation on {1,2,...,n} can be obtained by a 
sequence of interchanges (permutation of two numbers each time). 


Show that any n x n permutation matrix can be expressed as the 
product of at most n — 1 symmetric permutation matrices. 
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5.7 Nonnegative Matrices 


A nonnegative matrix is a matrix all of whose entries are nonnegative.
Permutation matrices and doubly stochastic matrices are nonnegative
matrices. If $A = (a_{ij})$ is a nonnegative matrix, i.e., $a_{ij} \ge 0$ for
all $i$ and $j$, we write $A \ge 0$ (or $0 \le A$). If all the entries of $A$ are
positive, we call $A$ a positive matrix and denote $A > 0$ (or $0 < A$).
Any matrix $A$ (including row and column vectors) is associated
with a nonnegative matrix (vector), written as $|A|$, whose entries are
the absolute values of the entries of $A$; that is, $|A| = (|a_{ij}|)$.
Important note: Let $A$ be a matrix (or a vector). In this section,
and only in this section, of the book, $A \ge (>)\, 0$ means that $A$ is a
nonnegative (positive) matrix and $|A|$ stands for the resulting matrix
by taking the absolute values of the entries of $A$. In all other sections
of the book, $A \ge 0$ means that $A$ is positive semidefinite and $|A|$ is
the modulus of $A$, i.e., $|A| = (A^*A)^{1/2}$. Each of these notations is
about equally and widely used in the literature for both meanings.
One may use $A \ge_e 0$ with a subscript $e$ for an entrywise nonnegative
matrix and $|A|_e = (|a_{ij}|)$ for an entrywise absolute value matrix.
For matrices $A$ and $B$ of the same size, we write

$$A \ge (>)\, B \quad\text{if}\quad A - B \ge (>)\, 0.$$

Obviously, when $A$ and $B$ are matrices of the same size

$$|A + B| \le |A| + |B|$$

and for matrices $A$ and $B$ of sizes $m \times n$ and $n \times m$, respectively,

$$|AB| \le |A||B|.$$

In particular, for any $n$-square complex matrix $A$ and vector $x$ in $\mathbb{C}^n$,

$$|Ax| \le |A||x|$$

and for any square complex matrix $A$ and positive integer $k$,

$$|A^k| \le |A|^k.$$

Furthermore, if $A > 0$ and $x \ne 0$, then $A|x| > 0$.
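The entrywise inequality $|AB| \le |A||B|$ is just the triangle inequality applied to each entry of the product. A small numerical sketch, assuming two complex $2 \times 2$ matrices chosen here purely for illustration (the helper names are hypothetical):

```python
def matmul(A, B):
    """Ordinary matrix product of nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def entrywise_abs(A):
    """|A| in the sense of this section: absolute value of every entry."""
    return [[abs(a) for a in row] for row in A]

A = [[1 + 2j, -3], [0.5j, 2]]
B = [[-1, 1j], [2 - 1j, 4]]

lhs = entrywise_abs(matmul(A, B))                 # |AB|
rhs = matmul(entrywise_abs(A), entrywise_abs(B))  # |A||B|
holds = all(l <= r + 1e-12
            for lrow, rrow in zip(lhs, rhs)
            for l, r in zip(lrow, rrow))
```

The small tolerance only guards against floating-point rounding; the inequality itself is exact.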

We are now ready to present a theorem on comparison of the
spectral radii of nonnegative matrices. Recall that the spectral radius
$\rho(A)$ of an $n$-square complex matrix $A$ is defined to be

$$\rho(A) = \max\{|\lambda| : \lambda \text{ is an eigenvalue of } A\}.$$

Note that $\rho(A)$ is in general not an eigenvalue of $A$. Our main goal
of this section is to show that if $A > 0$, then $\rho(A)$ is an eigenvalue of
$A$ and that a positive eigenvector belonging to this eigenvalue exists.


Theorem 5.22 Let $A \ge 0$ and $B \ge 0$ be $n$-square matrices. Then

$$A \ge B \quad\Rightarrow\quad \rho(A) \ge \rho(B).$$


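Proof. Since $A \ge B \ge 0$, i.e., $a_{ij} \ge b_{ij} \ge 0$ for all $i$ and $j$, we have

$$\sum_{i,j} a_{ij}^2 \ge \sum_{i,j} b_{ij}^2.$$

It follows that

$$\Big(\sum_{i,j} a_{ij}^2\Big)^{1/2} \ge \Big(\sum_{i,j} b_{ij}^2\Big)^{1/2}$$

or in Frobenius norm,

$$\|A\|_F \ge \|B\|_F.$$

On the other hand (Problem 4), for all positive integers $k$, we have

$$A \ge B \ge 0 \quad\Rightarrow\quad A^k \ge B^k.$$

This yields

$$\|A^k\|_F \ge \|B^k\|_F$$

or

$$\|A^k\|_F^{1/k} \ge \|B^k\|_F^{1/k}.$$

Taking limits and using Theorem 4.4, we have $\rho(A) \ge \rho(B)$.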
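The limit used in the last step can be watched numerically. The sketch below, under the assumption that $\|A^k\|_F^{1/k}$ with a moderately large $k$ is an adequate stand-in for the limit, estimates the spectral radius of two illustrative matrices with $A \ge B \ge 0$; the function names and the choice $k = 50$ are not from the text.

```python
import math

def matmul(A, B):
    """Ordinary matrix product of square nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def frobenius(A):
    """Frobenius norm: square root of the sum of squared entries."""
    return math.sqrt(sum(a * a for row in A for a in row))

def rho_estimate(A, k=50):
    """Approximate rho(A) by ||A^k||_F^(1/k)."""
    P = A
    for _ in range(k - 1):
        P = matmul(P, A)
    return frobenius(P) ** (1.0 / k)

A = [[2, 1], [1, 2]]   # eigenvalues 3 and 1
B = [[1, 1], [1, 1]]   # eigenvalues 2 and 0, and A >= B entrywise
rho_A = rho_estimate(A)
rho_B = rho_estimate(B)
```

Integer entries keep the matrix powers exact, so the only rounding happens in the final square root and $k$-th root.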


Consider a pair of $n$-square nonnegative matrices $A$ and $B$. Since
the Hadamard product $A \circ B$ is a principal submatrix of the
Kronecker product $A \otimes B$ (Theorem 4.7), by Problem 11, we have

$$\rho(A \circ B) \le \rho(A \otimes B) = \rho(A)\rho(B).$$

The example below shows that $\rho(AB) \le \rho(A)\rho(B)$ is not true:


4-(33). 2-(28) 
The following result ensures that $\rho(AB)$ dominates $\rho(A \circ B)$.

Theorem 5.23 Let $A$ and $B$ be $n \times n$ nonnegative matrices. Then

$$\rho(A \circ B) \le \rho(AB).$$


Proof. The proof is based on two facts: If $0 \le X \le Y$ then
$\rho(X) \le \rho(Y)$ (Theorem 5.22); and if $P$ is a principal submatrix
of a nonnegative square matrix $Q$ then $\rho(P) \le \rho(Q)$ (Problem 11).
We first show that

$$(A \circ B)(B \circ A) \le (AB) \circ (BA). \qquad (5.20)$$

Computing the $(i, j)$-entry of the right-hand side of (5.20), we have

$$\Big(\sum_s a_{is} b_{sj}\Big)\Big(\sum_t b_{it} a_{tj}\Big) = \sum_{s,t} a_{is} b_{sj} b_{it} a_{tj}.$$

Every term of the double sum is nonnegative, so keeping only the terms
with $s = t = p$ gives the lower bound $\sum_p a_{ip} b_{ip} a_{pj} b_{pj}$,
which is the $(i, j)$-entry of the left-hand side of (5.20).
Recalling that the Hadamard product is a principal submatrix of
the Kronecker product (Theorem 4.7), we have

$$(\rho(A \circ B))^2 \le \rho(AB \circ BA) \le \rho(AB \otimes BA) = (\rho(AB))^2.$$

By taking the square roots, we obtain the desired inequality.
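Both remarks, that $\rho(AB) \le \rho(A)\rho(B)$ can fail and that $\rho(A \circ B) \le \rho(AB)$ holds, can be checked on small cases. The sketch below uses two pairs of $2 \times 2$ matrices chosen here for illustration (they are an assumption, not necessarily the example printed in the text) and a closed-form spectral radius for $2 \times 2$ matrices; all helper names are hypothetical.

```python
import cmath

def rho2(M):
    """Spectral radius of a 2x2 matrix from its trace and determinant."""
    (a, b), (c, d) = M
    t, det = a + d, a * d - b * c
    root = cmath.sqrt(t * t - 4 * det)
    return max(abs((t + root) / 2), abs((t - root) / 2))

def matmul2(A, B):
    """Ordinary product of two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def hadamard(A, B):
    """Entrywise (Hadamard) product."""
    return [[A[i][j] * B[i][j] for j in range(2)] for i in range(2)]

# Pair 1: rho(A) = rho(B) = 0, yet AB has eigenvalue 1.
A1, B1 = [[0, 1], [0, 0]], [[0, 0], [1, 0]]
# Pair 2: positive matrices, used to check rho(A o B) <= rho(AB).
A2, B2 = [[1, 2], [3, 4]], [[2, 1], [1, 3]]

fails_submultiplicative = rho2(matmul2(A1, B1)) > rho2(A1) * rho2(B1)
thm_523_pair1 = rho2(hadamard(A1, B1)) <= rho2(matmul2(A1, B1))
thm_523_pair2 = rho2(hadamard(A2, B2)) <= rho2(matmul2(A2, B2))
```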


The next result reveals lower and upper bounds for the spectral 
radius of a nonnegative matrix in terms of the entries, precisely, the 
row and column sums of the matrix. 


Theorem 5.24 Let $A$ be an $n$-square nonnegative matrix. Then

$$\min_i \sum_{j=1}^n a_{ij} \le \rho(A) \le \max_i \sum_{j=1}^n a_{ij}.$$

In other words, the spectral radius of a nonnegative square matrix is
between the smallest row sum and the largest row sum.

Proof. We may assume $A \ne 0$. Denote the $i$th row sum by $r_i$;
namely, $r_i = a_{i1} + a_{i2} + \cdots + a_{in}$. Let $r$ be the smallest row sum of
$A$; that is, $r = \min_i r_i$. We construct a new matrix $B$ such that

$$0 \le B \le A \quad\text{and}\quad r = \rho(B).$$

If $r = 0$, set $B = 0$. Otherwise, let

$$b_{ij} = r\, r_i^{-1} a_{ij}.$$

It is immediate that $0 \le B \le A$. Theorem 5.22 ensures
$\rho(B) \le \rho(A)$. To see $\rho(B) = r$, first observe that $Be_0 = re_0$, where
$e_0 = (1, \dots, 1)^T$. This says that $r$ is an eigenvalue of $B$. So $\rho(B) \ge r$.
However, considering the maximum row sum matrix norm $\|\cdot\|_\infty$
(Problem 8, Section 4.2), we have $\rho(B) \le \|B\|_\infty = r$. Thus,
$\rho(B) = r$. The upper bound is similarly shown. ∎

Similar results hold for the columns with $A^T$ in place of $A$.
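
As a quick check of these bounds, consider a small matrix chosen here for illustration (it does not come from the text):

$$A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}, \qquad \det(\lambda I - A) = \lambda^2 - 5\lambda - 2, \qquad \rho(A) = \frac{5 + \sqrt{33}}{2} \approx 5.372.$$

The row sums are 3 and 7 and the column sums are 4 and 6, so both pairs of bounds hold, with the column sums giving the tighter estimate in this case:

$$3 \le 4 \le \rho(A) \le 6 \le 7.$$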

We now show a fundamental theorem on nonnegative matrices.
We present the theorem for positive matrices. The results can be
generalized and amplified to nonnegative matrices (due to Frobenius).


Theorem 5.25 (Perron) Let $A$ be an $n \times n$ positive matrix. Then

1. $\rho(A) > 0$.

2. $\rho(A)$ is an eigenvalue of $A$.

3. $Ax = \rho(A)x$ for some vector $x > 0$.

4. If $Au = \rho(A)u$ and $Av = \rho(A)v$, then $u = \alpha v$ for some $\alpha \in \mathbb{C}$.

5. If $\lambda$ is an eigenvalue of $A$ and $\lambda \ne \rho(A)$, then $|\lambda| < \rho(A)$.


Proof. If $A > 0$, then each row sum (and column sum) is greater
than zero. By the preceding theorem, (1) is true.

To show (2) and (3), let $Ax = \lambda x$, where $x \ne 0$ and $|\lambda| = \rho(A)$.
We show that $A|x| = \rho(A)|x|$. Denote $\rho = \rho(A)$ for simplicity. Then

$$\rho|x| = |\lambda||x| = |\lambda x| = |Ax| \le |A||x| = A|x|.$$

Note that $A|u| > 0$ for all $u \ne 0$ since every entry of $A$ is positive.
So $A|x| > 0$. Now set $y = A|x| - \rho|x|$. Then $y \ge 0$. If $y \ne 0$, then

$$0 < Ay = A(A|x|) - \rho A|x|$$

or

$$Az > \rho z, \quad\text{where } z = A|x| > 0.$$

If we write $z = A|x| = (z_1, \dots, z_n)$, then all $z_i$ are positive and
$Az > \rho z$ implies that $\sum_{j=1}^n a_{ij} z_j > \rho z_i$ for all $i$. Let $Z$ be the
diagonal matrix $\mathrm{diag}(z_1, \dots, z_n)$. Using Theorem 5.24, we have

$$\rho(A) = \rho(Z^{-1}AZ) \ge \min_i \frac{1}{z_i} \sum_{j=1}^n a_{ij} z_j > \rho = \rho(A),$$

a contradiction. So $y = 0$, i.e., $A|x| = \rho|x|$, and $|x| > 0$ as $A|x| > 0$.
We have shown that for any square positive matrix $A$,

$$Ax = \lambda x,\ x \ne 0,\ |\lambda| = \rho(A) \quad\Rightarrow\quad A|x| = \rho(A)|x|,\ |x| > 0. \qquad (5.21)$$


To show (4), following the above argument, we have $A|x| = \rho|x|$
and $|x| > 0$ whenever $Ax = \rho x$ and $x \ne 0$. Thus for $t = 1, 2, \dots, n$,

$$\rho|x_t| = \sum_{j=1}^n a_{tj}|x_j|.$$

However, $\rho|x| = |\rho x| = |Ax|$, thus we have

$$\rho|x_t| = \Big|\sum_{j=1}^n a_{tj} x_j\Big| \quad\text{and}\quad \Big|\sum_{j=1}^n a_{tj} x_j\Big| = \sum_{j=1}^n a_{tj}|x_j|,$$

which holds if and only if all $a_{t1}x_1, a_{t2}x_2, \dots, a_{tn}x_n$, thus
$x_1, x_2, \dots, x_n$ (as $a_{t1}, a_{t2}, \dots, a_{tn}$ are positive), have the same
argument (Problem 6), namely, there exists $\theta \in \mathbb{R}$ such that
for all $j = 1, 2, \dots, n$,

$$e^{\theta i} x_j > 0; \quad\text{that is,}\quad |x| = e^{\theta i} x > 0. \qquad (5.22)$$

Now suppose $u$ and $v$ are two eigenvectors belonging to the
eigenvalue $\rho$. By the above discussion, there exist real numbers
$\theta_1$ and $\theta_2$ such that $|u| = e^{\theta_1 i}u > 0$ and
$|v| = e^{\theta_2 i}v > 0$. Set $\beta = \min_t |u_t|/|v_t|$
and define $w = |u| - \beta|v|$. Then $w$ is a nonnegative vector having at
least one component 0. On the other hand,

$$Aw = A(|u| - \beta|v|) = A|u| - \beta A|v| = \rho|u| - \beta\rho|v| = \rho w.$$

If $w \ne 0$, then $w > 0$ by (5.21), a contradiction. Thus $w = 0$ and
$|u| = \beta|v|$, which implies $u = \alpha v$, where $\alpha = \beta e^{(\theta_2 - \theta_1)i}$.

For (5), if $|\lambda| = \rho$ and $Ax = \lambda x$, $x \ne 0$, then $e^{\theta i}x > 0$ for some
$\theta \in \mathbb{R}$ by (5.22). Set $y = e^{\theta i}x$. Then $Ay = \lambda y$, and $Ay$ is
positive because $A > 0$ and $y > 0$. So $\lambda > 0$ and $\lambda = \rho$. In other
words, if $\lambda \ne \rho$, then $|\lambda| < \rho$. ∎


The Perron theorem simply states that for any square positive
matrix $A$, the spectral radius $\rho(A)$ is an eigenvalue of $A$, known as
the Perron root. The Perron eigenvalue of a positive matrix is the
only eigenvalue that attains the spectral radius. Moreover, the
positive eigenvectors, known as Perron vectors, of the Perron eigenvalue
are unique up to magnitude. These statements hold for irreducible
nonnegative matrices. However, it is possible for a nonnegative $A$ to
have several eigenvalues that have the maximum modulus $\rho(A)$.
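
Because the Perron root of a positive matrix strictly dominates every other eigenvalue in modulus, repeated multiplication by $A$ drives any positive starting vector toward a Perron vector. The sketch below is one standard way (power iteration) to approximate both; the function name `perron`, the iteration count, and the sample matrix are assumptions made for illustration, not part of the text.

```python
def perron(A, iters=200):
    """Approximate the Perron root and the Perron vector with entries summing to 1.

    Assumes A is a square matrix with strictly positive entries.
    """
    n = len(A)
    x = [1.0 / n] * n              # positive start, entries sum to 1
    rho = 0.0
    for _ in range(iters):
        y = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        rho = sum(y)               # sum(x) == 1, so sum(Ax) estimates rho
        x = [yi / rho for yi in y]  # renormalize so the entries sum to 1
    return rho, x

A = [[1.0, 2.0], [3.0, 4.0]]
rho, x = perron(A)
residual = max(abs(sum(A[i][j] * x[j] for j in range(2)) - rho * x[i])
               for i in range(2))
```

The normalization by the entry sum matches Problem 5 below, which asks for the unique positive eigenvector whose entries add up to 1.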


Problems 


1. Does there exist a positive matrix that is similar to
[matrix illegible in the source]?


2. Compute explicitly the eigenvalues of the matrix

[matrix illegible in the source], $0 < \alpha < 1$.

Show that the spectral radius $\rho(A)$ can be strictly less than the
numerical radius $w(A) = \max_{\|x\|=1} |x^*Ax|$ for some $\alpha$. What if
$\alpha = \frac{1}{2}$?


3. Can the inverse of a positive matrix be also positive?


4. Let $A, B, C, D$ be square matrices of the same size. Show that

$$0 \le B \le A,\ 0 \le D \le C \quad\Rightarrow\quad 0 \le BD \le AC.$$


5. Let $A$ be an $n$-square positive matrix. Show that there exists a unique
vector $x > 0$ such that $Ax = \rho(A)x$ and $x_1 + x_2 + \cdots + x_n = 1$.


6. Let $p_1, p_2, \dots, p_n$ be positive numbers and let $c_1, c_2, \dots, c_n \in \mathbb{C}$. If

$$|p_1c_1 + p_2c_2 + \cdots + p_nc_n| = p_1|c_1| + p_2|c_2| + \cdots + p_n|c_n|,$$

show that $e^{\theta i}c_k \ge 0$ for some $\theta \in \mathbb{R}$ and all $k = 1, 2, \dots, n$.


7. Let $A > 0$ be a square matrix. If $\alpha x \le Ax \le \beta x$ for some scalars
$\alpha$, $\beta$, and nonnegative vector $x \ne 0$, show that $\alpha \le \rho(A) \le \beta$.


8. Let $A > 0$ be a square matrix. If $Au = \lambda u$ for some positive vector
$u$ and a scalar $\lambda$, show that $\lambda = \rho(A)$. Is it true that $Av \le \rho(A)v$ for
all positive vectors $v$?


9. Show that $\rho(A) \le \rho(|A|)$ for any complex square matrix $A$.


10. Compute $\rho(AB)$, $\rho(A)\rho(B)$, and $\rho(A \circ B)$ for each pair of the
following matrices and conclude that any two of these quantities can be the
same whereas the third one may be different:

$$\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}; \qquad \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}; \qquad \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}.$$


11. Let $A \ge 0$ be a square matrix and $B$ be a (proper) principal
submatrix of $A$. Show that $\rho(B) \le \rho(A)$. If $A > 0$, show that $\rho(B) < \rho(A)$.


12. Let $A$ and $B$ be $n \times n$ positive matrices. Show that $\rho(A \circ B) < \rho(AB)$.


13. Let $A$ and $B$ be $n \times n$ nonnegative matrices. Show that
$\|A \circ B\|_{op} \le \rho(A^TB)$, where $\|A \circ B\|_{op}$ is the operator (spectral)
norm of $A \circ B$. Show by example that $\|A \circ B\|_{op} \le \rho(AB)$ does not hold
in general.


14. Let $A > 0$ and let $M$ = [block matrix illegible in the source]. What is
$\rho(M)$? How many eigenvalues of $M$ are there that have the maximum
modulus $\rho(M)$?


15. Show that every square positive matrix is similar to a positive matrix
all of whose row sums are constant.


16. Let $A$ = [matrix illegible in the source] and $B$ = [matrix illegible in
the source]. Show that $Au \ne Bv$ for any positive vectors $u$ and $v$. Use this
to show that no two multiplicative elements of $A$ and $B$ written in different
ways are the same. For example, [product example illegible in the source].


17. An M-matrix $A$ is a matrix that can be written as

$$A = \alpha I - P, \quad\text{where } P \ge 0,\ \alpha \ge \rho(P).$$

Let $A$ be an M-matrix. Show that

(a) All principal submatrices of $A$ are M-matrices.

(b) All real eigenvalues of $A$ are nonnegative.

(c) The determinant of $A$ is nonnegative.

(d) If $A$ is nonsingular, then $A^{-1} \ge 0$.

CHAPTER 6 


Unitary Matrices and Contractions 


Introduction: This chapter studies unitary matrices and contrac- 
tions. Section 6.1 gives basic properties of unitary matrices, Section 
6.2 discusses the structure of real orthogonal matrices under similar- 
ity, and Section 6.3 develops metric spaces and the fixed-point theo- 
rem of strict contractions. Section 6.4 deals with the connections of 
contractions with unitary matrices, Section 6.5 concerns the unitary 
similarity of real matrices, and Section 6.6 presents a trace inequality 
for unitary matrices, relating the average of the eigenvalues of each 
of two unitary matrices to that of their product. 


6.1 Properties of Unitary Matrices 


A unitary matrix is a square complex matrix satisfying

$$U^*U = UU^* = I.$$

Notice that $U^* = U^{-1}$ and $|\det U| = 1$ for any unitary matrix $U$. A
complex (real) matrix $A$ is called complex (real) orthogonal if

$$A^TA = AA^T = I.$$

Unitary matrices and complex orthogonal matrices are different in
general, but real unitary and real orthogonal matrices are the same.

Theorem 6.1 Let $U \in M_n$ be a unitary matrix. Then

1. $\|Ux\| = \|x\|$ for every $x \in \mathbb{C}^n$.

2. $|\lambda| = 1$ for every eigenvalue $\lambda$ of $U$.

3. $U = V \mathrm{diag}(\lambda_1, \dots, \lambda_n) V^*$, where $V$ is unitary and each $|\lambda_i| = 1$.

4. The column vectors of $U$ form an orthonormal basis for $\mathbb{C}^n$.
Proof. (1) is obtained by rewriting the norm as an inner product:

$$\|Ux\|^2 = (Ux, Ux) = (x, U^*Ux) = (x, x) = \|x\|^2.$$

To show (2), let $x$ be a unit eigenvector of $U$ corresponding to
eigenvalue $\lambda$. Then, by using (1),

$$|\lambda| = |\lambda|\|x\| = \|\lambda x\| = \|Ux\| = \|x\| = 1.$$

(3) is by the spectral decomposition theorem (Theorem 3.4).
For (4), suppose that $u_i$ is the $i$th column of $U$, $i = 1, \dots, n$.
Then the matrix identity $U^*U = I$ is equivalent to

$$\begin{pmatrix} u_1^* \\ \vdots \\ u_n^* \end{pmatrix} (u_1, \dots, u_n) = \begin{pmatrix} 1 & & 0 \\ & \ddots & \\ 0 & & 1 \end{pmatrix}.$$

This says $(u_i, u_j) = u_j^*u_i = 1$ if $i = j$, and 0 otherwise.
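
Items (1) and (2) of Theorem 6.1 are easy to confirm numerically. The sketch below uses one particular $2 \times 2$ unitary matrix and one test vector, both chosen here as illustrative assumptions; the helper names are hypothetical.

```python
import cmath
import math

s = 1 / math.sqrt(2)
U = [[s, s * 1j], [s * 1j, s]]        # a 2x2 unitary matrix: U*U = I

def apply(M, x):
    """Matrix-vector product."""
    return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]

def norm(x):
    """Euclidean norm on C^n."""
    return math.sqrt(sum(abs(t) ** 2 for t in x))

def eigenvalues_2x2(M):
    """Eigenvalues of a 2x2 matrix from its trace and determinant."""
    (a, b), (c, d) = M
    t, det = a + d, a * d - b * c
    root = cmath.sqrt(t * t - 4 * det)
    return [(t + root) / 2, (t - root) / 2]

x = [3, 4j]
norm_before = norm(x)                 # equals 5
norm_after = norm(apply(U, x))        # should also equal 5
moduli = [abs(lam) for lam in eigenvalues_2x2(U)]
```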


An interesting observation follows. Note that for any unitary U, 


$$\mathrm{adj}(U) = (\det U)U^{-1} = (\det U)U^*.$$

If we partition $U$ as

$$U = \begin{pmatrix} u & \alpha \\ \beta & U_1 \end{pmatrix}, \quad u \in \mathbb{C},$$

then, by comparing the $(1, 1)$-entries in $\mathrm{adj}(U) = (\det U)U^*$,

$$\det U_1 = \det U\, \bar{u},$$

which is also a consequence of Theorem 2.3.

As we saw in Theorem 6.1, the eigenvalues of a unitary matrix
are necessarily equal to 1 in absolute value. The converse is not true
in general. We have, however, the following result.

Theorem 6.2 Let $A \in M_n$ have all the eigenvalues equal to 1 in
absolute value. Then $A$ is unitary if, for all $x \in \mathbb{C}^n$,

$$\|Ax\| \le \|x\|.$$
Proof 1. The given inequality is equivalent to

$$\|Ax\| \le 1, \quad\text{for all unit } x \in \mathbb{C}^n.$$

This specifies that $\sigma_{\max}(A) \le 1$ (Problem 7, Section 4.1).

On the other hand, the identity $|\det A|^2 = \det(A^*A)$ implies
that the product of eigenvalues in absolute value equals the product
of singular values. If $A$ has only eigenvalues 1 in absolute value, then
the product of the singular values is 1; since none of them exceeds 1,
the smallest singular value of $A$ has to be 1, and thus all the singular
values of $A$ are equal to 1. Therefore, $A^*A = I$, and $A$ is unitary.


Proof 2. Let $A = U^*DU$ be a Schur decomposition of $A$, where $U$ is
a unitary matrix, and $D$ is an upper-triangular matrix:

$$D = \begin{pmatrix} \lambda_1 & t_{12} & \cdots & t_{1n} \\ & \lambda_2 & \ddots & \vdots \\ & & \ddots & t_{n-1,n} \\ 0 & & & \lambda_n \end{pmatrix},$$

where $|\lambda_i| = 1$, $i = 1, 2, \dots, n$, and $t_{ij}$ are complex numbers.

Take $x = U^*e_n = U^*(0, \dots, 0, 1)^T$. Then $\|x\| = 1$. We have

$$\|Ax\| = \|De_n\| = (|t_{1n}|^2 + \cdots + |t_{n-1,n}|^2 + |\lambda_n|^2)^{1/2}.$$

The conditions $\|Ax\| \le 1$ for unit $x$ and $|\lambda_n| = 1$ force each
$t_{in} = 0$ for $i = 1, 2, \dots, n - 1$. By induction, one sees that $D$ is a
diagonal matrix with the $\lambda_i$ on the diagonal. Thus, $A$ is unitary, for

$$A^*A = U^*D^*UU^*DU = U^*D^*DU = U^*U = I. \ ∎$$


We now show a result on the singular values of the principal
submatrices of a unitary matrix.

Theorem 6.3 Let $U$ be a unitary matrix partitioned as

$$U = \begin{pmatrix} A & B \\ C & D \end{pmatrix},$$

where $A$ is $m \times m$ and $D$ is $n \times n$. If $m = n$, then $A$ and $D$ have the
same singular values. If $m < n$ and if the singular values of $A$ are
$\sigma_1, \dots, \sigma_m$, then the singular values of $D$ are
$\sigma_1, \dots, \sigma_m, 1, \dots, 1$, with $n - m$ ones.


Proof. Since $U$ is unitary, the identities $U^*U = UU^* = I$ imply

$$A^*A + C^*C = I_m, \quad CC^* + DD^* = I_n.$$

It follows that

$$A^*A = I_m - C^*C, \quad DD^* = I_n - CC^*.$$

Note that $CC^*$ and $C^*C$ have the same nonzero eigenvalues. Hence,
$I_n - CC^*$ and $I_m - C^*C$ have the same eigenvalues except $n - m$ 1s;
that is, $A^*A$ and $DD^*$ have the same eigenvalues except $n - m$ 1s.
Therefore, if $m = n$, then $A$ and $D$ have the same singular values,
and if $m < n$ and $A$ has singular values $\sigma_1, \dots, \sigma_m$, then the singular
values of $D$ are $\sigma_1, \dots, \sigma_m$, plus $n - m$ 1s. ∎


An interesting result on the unitary matrix $U$ in Theorem 6.3 is

$$|\det A| = |\det D|.$$

In other words, the complementary principal submatrices of a unitary
matrix always have the same determinant in absolute value.
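
A small numerical sketch of Theorem 6.3 with $m = 1$ and $n = 2$: the $3 \times 3$ real orthogonal matrix below is the product of two plane rotations, an illustrative choice that is not from the text, and the helper names are hypothetical. The $1 \times 1$ block $A$ has the single singular value $|u_{11}|$, so $D$ should have singular values $|u_{11}|$ and 1.

```python
import math

def rotation_product(theta, phi):
    """Rz(theta) * Rx(phi): a 3x3 real orthogonal (hence unitary) matrix."""
    c, s = math.cos(theta), math.sin(theta)
    cp, sp = math.cos(phi), math.sin(phi)
    return [[c, -s * cp, s * sp],
            [s, c * cp, -c * sp],
            [0.0, sp, cp]]

def singular_values_2x2(D):
    """Singular values of a real 2x2 matrix via the eigenvalues of D^T D."""
    t = sum(x * x for row in D for x in row)              # trace of D^T D
    d = (D[0][0] * D[1][1] - D[0][1] * D[1][0]) ** 2      # det of D^T D
    disc = math.sqrt(max(t * t - 4 * d, 0.0))
    return [math.sqrt(max((t - disc) / 2, 0.0)), math.sqrt((t + disc) / 2)]

U = rotation_product(0.7, 1.1)
a = U[0][0]                                   # the 1x1 block A
D = [[U[1][1], U[1][2]], [U[2][1], U[2][2]]]  # the 2x2 block D
small, large = singular_values_2x2(D)
det_D = D[0][0] * D[1][1] - D[0][1] * D[1][0]
```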


Problems 


1. Which of the items in Theorem 6.1 implies that $U$ is a unitary matrix?

2. Show that for any $\theta_1, \dots, \theta_n \in \mathbb{R}$,
$\mathrm{diag}(e^{i\theta_1}, \dots, e^{i\theta_n})$ is unitary.

3. What are the singular values of a unitary matrix?

4. Find an $m \times n$ matrix $V$, $m \ne n$, such that $V^*V = I_n$.

5. Let $A$ = [matrix illegible in the source], where $x \in \mathbb{C}$. What are
the eigenvalues and singular values of $A$ in terms of $x$?

6. Show that for any $\theta \in \mathbb{R}$, the following two matrices are similar:

$$\begin{pmatrix} e^{i\theta} & 0 \\ 0 & e^{-i\theta} \end{pmatrix}, \quad \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}.$$

[Note: They are both unitary; the first is complex; the second is real.]

7. Show that any $2 \times 2$ unitary matrix with determinant equal to 1 is
similar to a real orthogonal matrix.

8. Let $A$ and $C$ be $m$- and $n$-square matrices, respectively, and let

$$M = \begin{pmatrix} A & B \\ 0 & C \end{pmatrix}.$$

Show that $M$ is unitary if and only if $B = 0$ and $A$ and $C$ are unitary.

9. Show that the $2 \times 2$ block matrix below is real orthogonal:

$$\begin{pmatrix} \sqrt{\lambda}\, I & -\sqrt{1 - \lambda}\, I \\ \sqrt{1 - \lambda}\, I & \sqrt{\lambda}\, I \end{pmatrix}, \quad \lambda \in [0, 1].$$

10. If $A$ is a unitary matrix with all eigenvalues real, show that

$$A^2 = I \quad\text{and}\quad A^* = A.$$

11. If $A$ is similar to a unitary matrix, show that $A^*$ is similar to $A^{-1}$.

12. Let $A \in M_n$ be Hermitian. Show that $(A - iI)^{-1}(A + iI)$ is unitary.

13. Let $A$ be an $n \times n$ unitary matrix. If $A - I$ is nonsingular, show that
$i(A - I)^{-1}(A + I)$ is Hermitian.

14. Let $a$ and $b$ be real numbers such that $a^2 - b^2 = 1$, $ab \ne 0$, and let

$$K = \begin{pmatrix} a & bi \\ -bi & a \end{pmatrix}.$$

(a) Show that $K$ is complex orthogonal but not unitary; that is,
$K^TK = I$ but $K^*K \ne I$.

(b) Let $a = \sqrt{2}$ and $b = 1$. Find the eigenvalues of $K$.

(c) Let $a = \frac{1}{2}(e + e^{-1})$ and $b = \frac{1}{2}(e - e^{-1})$, where $e = 2.718\ldots$.
Show that the eigenvalues of $K$ are $e$ and $e^{-1}$.

(d) Let $a = \frac{1}{2}(e^t + e^{-t})$ and $b = \frac{1}{2}(e^t - e^{-t})$, $t \in \mathbb{R}$. What are the
eigenvalues of $K$? Is the trace of $K$ bounded?

15. Let $A$ be an $n$-square complex matrix. Show that $A = 0$ if $A$ satisfies

$$|(Ay, y)| \le \|y\|, \quad\text{for all } y \in \mathbb{C}^n,$$

or

$$|(Ay, y)| \le \|Ay\|, \quad\text{for all } y \in \mathbb{C}^n.$$

[Hint: If $A \ne 0$, then $(Ay_0, y_0) \ne 0$ for some $y_0 \in \mathbb{C}^n$.]

16. Let $A \in M_n$ and $\alpha \in (0, 1)$. What can be said about $A$ if

$$|(Ay, y)| \le (y, y)^\alpha, \quad\text{for all } y \in \mathbb{C}^n?$$

17. Let $A \in M_n$ have all eigenvalues equal to 1 in absolute value. Show
that $A$ is unitary if $A$ satisfies

$$|(Ay, y)| \le \|Ay\|^2, \quad\text{for all } y \in \mathbb{C}^n,$$

or

$$|(Ay, y)| \le \|y\|^2, \quad\text{for all } y \in \mathbb{C}^n.$$

18. Let $A$ be an $n \times n$ complex matrix having the largest and the smallest
singular values $\sigma_{\max}(A)$ and $\sigma_{\min}(A)$, respectively. Show that

$$(Ay, Ay) \le (y, y), \ \text{for all } y \in \mathbb{C}^n, \quad\Rightarrow\quad \sigma_{\max}(A) \le 1$$

and

$$(y, y) \le (Ay, Ay), \ \text{for all } y \in \mathbb{C}^n, \quad\Rightarrow\quad \sigma_{\min}(A) \ge 1.$$

19. Let $A \in M_n$ have all eigenvalues equal to 1 in absolute value. Show
that $A$ is unitary if $A$ satisfies, for some real $\alpha \ne \frac{1}{2}$,

$$|(Ay, y)| \le (Ay, Ay)^\alpha, \quad\text{for all unit } y \in \mathbb{C}^n.$$

[Hint: Assume that $A$ is upper-triangular with $a_{11} = 1$, $a_{1i} > 0$ for
some $i > 1$. Take $y = (\cos t, 0, \dots, 0, \sin t, 0, \dots, 0)$ and consider the
behavior of the function $f(t) = (Ay, Ay)^\alpha - |(Ay, y)|$ near the origin.]


6.2 Real Orthogonal Matrices


This section is devoted to real orthogonal matrices, the real matrices
$A$ satisfying $AA^T = A^TA = I$. We discuss the structure of real
orthogonal matrices under similarity and show that real orthogonal
matrices with a commutativity condition are necessarily involutions.
We begin with $2 \times 2$ real orthogonal matrices $A$:

$$A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}, \quad a, b, c, d \in \mathbb{R}.$$

The identities $AA^T = A^TA = I$ imply several equations in $a$, $b$, $c$,
and $d$, one of which is $a^2 + b^2 = 1$. Since $a$ is real between $-1$ and 1,
one may set $a = \cos\theta$ for some $\theta \in \mathbb{R}$, and get $b$, $c$, and $d$ in
terms of $\theta$. Thus, there are only two types of $2 \times 2$ real orthogonal
matrices:

$$\begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}, \quad \begin{pmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{pmatrix}, \quad \theta \in \mathbb{R}, \qquad (6.1)$$

where the first type is called rotation and the second reflection.
We show that a real orthogonal matrix is similar to a direct sum
of real orthogonal matrices of order 1 or 2.
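
The two types in (6.1) can be told apart by their determinants, and a reflection is always an involution. The following sketch checks these facts for a few sample angles; the helper names and the angles are illustrative assumptions.

```python
import math

def rotation(theta):
    """First type in (6.1)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [-s, c]]

def reflection(theta):
    """Second type in (6.1)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [s, -c]]

def matmul2(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(A):
    return [[A[j][i] for j in range(2)] for i in range(2)]

def det2(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def is_identity(A, tol=1e-12):
    return all(abs(A[i][j] - (1.0 if i == j else 0.0)) < tol
               for i in range(2) for j in range(2))

angles = [0.0, 0.3, 1.0, 2.5]
rotations_ok = all(is_identity(matmul2(rotation(t), transpose(rotation(t))))
                   and abs(det2(rotation(t)) - 1) < 1e-12 for t in angles)
reflections_ok = all(is_identity(matmul2(reflection(t), reflection(t)))
                     and abs(det2(reflection(t)) + 1) < 1e-12 for t in angles)
```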


Theorem 6.4 Every real orthogonal matrix is real orthogonally sim- 
ilar to a direct sum of real orthogonal matrices of order at most 2. 


Proof. Let $A$ be an $n \times n$ real orthogonal matrix. We apply induction
on $n$. If $n = 1$ or 2, there is nothing to prove.

Suppose $n > 2$. If $A$ has a real eigenvalue $\lambda$ with a real unit
eigenvector $x$, then

$$Ax = \lambda x,\ x \ne 0 \quad\Rightarrow\quad x^TA^TAx = \lambda^2 x^Tx.$$

Thus, $\lambda = \pm 1$, say, 1. Extend the real unit eigenvector $x$ to a real
orthogonal matrix $P$. Then $P^TAP$ has $(1, 1)$-entry 1 and 0 elsewhere
in the first column. Write in symbols

$$P^TAP = \begin{pmatrix} 1 & u \\ 0 & A_1 \end{pmatrix}.$$

The orthogonality of $A$ implies $u = 0$. Notice that $A_1$ is also real
orthogonal. The conclusion then follows from an induction on $A_1$.
Assume that $A$ has no real eigenvalues. Then for any nonzero
$x \in \mathbb{R}^n$, vectors $x$ and $Ax$ cannot be linearly dependent. Recall the
angle $\angle_{x,y}$ between two vectors $x$ and $y$ and note that
$\angle_{x,y} = \angle_{Ax,Ay}$ due to the orthogonality of $A$. Define $f(x)$ to be
the angle function

$$f(x) = \angle_{x,Ax} = \cos^{-1} \frac{(x, Ax)}{\|x\|\|Ax\|}.$$

Then $f(x)$ is continuous on the compact set $S = \{x \in \mathbb{R}^n : \|x\| = 1\}$.
Let $\theta_0 = \angle_{x_0,Ax_0}$ be the minimum of $f(x)$ on $S$. Let $y_0$ be the
unit vector in $\mathrm{Span}\{x_0, Ax_0\}$ such that $\angle_{x_0,y_0} = \angle_{y_0,Ax_0}$.
Then by Theorem 1.10, we have

$$\theta_0 \le \angle_{y_0,Ay_0} \le \angle_{y_0,Ax_0} + \angle_{Ax_0,Ay_0} = \frac{\theta_0}{2} + \frac{\theta_0}{2} = \theta_0$$

and $Ax_0 \in \mathrm{Span}\{y_0, Ay_0\}$. Thus, $Ay_0$ has to belong to
$\mathrm{Span}\{x_0, y_0\}$. It follows, because $Ax_0$ is also in the subspace, that
$\mathrm{Span}\{x_0, y_0\}$ is an invariant subspace under $A$. We thus write $A$, up
to similarity by taking a suitable orthonormal basis (equivalently, via a
real orthogonal matrix), as

$$A = \begin{pmatrix} T_0 & 0 \\ 0 & B \end{pmatrix},$$

where $T_0$ is a $2 \times 2$ matrix, and $B$ is a matrix of order $n - 2$. Since
$A$ is orthogonal, so are $T_0$ and $B$. An application of the induction
hypothesis to $B$ completes the proof.


Another way to attack the problem is to consider the eigenvectors
of $A$. One may again focus on the nonreal eigenvalues. Since $A$ is
real, the characteristic polynomial of $A$ has real coefficients, and the
nonreal eigenvalues of $A$ thus occur in conjugate pairs. Furthermore,
their eigenvectors are in the forms $\alpha + \beta i$ and $\alpha - \beta i$, where
$\alpha$ and $\beta$ are real, $(\alpha, \beta) = 0$, and $A\alpha = a\alpha - b\beta$,
$A\beta = b\alpha + a\beta$ for some real $a$ and $b$ with $a^2 + b^2 = 1$. Matrix $A$
will have the desired form (via orthogonal similarity) by choosing a
suitable real orthonormal basis.

Our next theorem shows that a matrix with a certain commuting
property is necessarily an involution. For this purpose, we need a
result that is of interest in its own right.

If two complex square matrices $F$ and $G$ of orders $m$ and $n$,
respectively, have no eigenvalues in common, then the matrix equation
$FX - XG = 0$ has a unique solution $X = 0$.

To see this, rewrite the equation as $FX = XG$. Then for every
positive integer $k$, $F^kX = XG^k$. It follows that

$$f(F)X = Xf(G)$$

for every polynomial $f$. In particular, we take $f$ to be the
characteristic polynomial $\det(\lambda I - F)$ of $F$; then $f(F) = 0$, and thus
$Xf(G) = 0$. However, because $F$ and $G$ have no eigenvalues in
common, $f(G)$ is nonsingular and hence $X = 0$.


Theorem 6.5 Let $A$ and $U$ be real orthogonal matrices of the same
size. If $U$ has no repeated eigenvalues and if

$$UA = AU^T,$$

then $A$ is an involution, that is, $A^2 = I$.


Proof. By the previous theorem, let $P$ be a real orthogonal matrix
such that $P^{-1}UP$ is a direct sum of orthogonal matrices $V_i$ of order
1 or 2. The identity $UA = AU^T$ results in

$$(P^{-1}UP)(P^{-1}AP) = (P^{-1}AP)(P^{-1}UP)^T.$$

Partition $P^{-1}AP$ conformally with $P^{-1}UP$ as $P^{-1}AP = (B_{ij})$,
where the $B_{ij}$ are matrices whose number of rows (or columns) is
1 or 2. Then $UA = AU^T$ gives

$$V_iB_{ij} = B_{ij}V_j^T, \quad i, j = 1, \dots, k. \qquad (6.2)$$

Since $U$, thus $P^{-1}UP$, has no repeated eigenvalues, we have

$$B_{ij} = 0, \quad i \ne j.$$

Hence $P^{-1}AP$ is a direct sum of matrices of order no more than 2:

$$P^{-1}AP = B_{11} \oplus \cdots \oplus B_{kk}.$$

The orthogonality of $A$, thus $P^{-1}AP$, implies that each $B_{ii}$ is either
an orthogonal matrix of order 1 or an orthogonal matrix of form
(6.1). Obviously $B_{ii}^2 = I$ if $B_{ii}$ is $\pm 1$ or a reflection. Now suppose
$B_{ii}$ is a rotation; then $V_i$ is not a rotation. Otherwise, $V_i$ and $B_{ii}$
are both rotations and hence commute (Problem 4), so that
$V_i^2B_{ii} = B_{ii}$ and $V_i^2 = I$. Using the rotation in (6.1), we have
$V_i = \pm I$, contradicting the fact that $V_i$ has two distinct eigenvalues.
Thus, $V_i$ is a reflection, hence orthogonally similar to $\mathrm{diag}(1, -1)$. It
follows that $B_{ii}$ is similar to $\mathrm{diag}(\pm 1, \pm 1)$ by (6.2). In either case
$B_{ii}^2 = I$. Thus, $(P^{-1}AP)^2 = I$ and $A^2 = I$. ∎


Problems 


1. Give a $2 \times 2$ matrix such that $A^2 = I$ but $A^*A \ne I$.


2. When is an upper-triangular matrix (complex or real) orthogonal?


3. If $A$ is a $2 \times 2$ real matrix with a complex eigenvalue $\lambda = a + bi$,
$a, b \in \mathbb{R}$, $b \ne 0$, show that $A$ is similar to the real matrix

$$\begin{pmatrix} a & b \\ -b & a \end{pmatrix}.$$


4. Verify that $A$ and $B$ commute; that is, $AB = BA$, where

$$A = \begin{pmatrix} \cos\alpha & \sin\alpha \\ -\sin\alpha & \cos\alpha \end{pmatrix}, \quad B = \begin{pmatrix} \cos\beta & \sin\beta \\ -\sin\beta & \cos\beta \end{pmatrix}, \quad \alpha, \beta \in \mathbb{R}.$$


5. Show that a real matrix $P$ is an orthogonal projection if and only if $P$
is orthogonally similar to a matrix in the form $\mathrm{diag}(1, \dots, 1, 0, \dots, 0)$.


6. Show that, with a rotation (in the $xy$-plane) of angle $\theta$ written as

$$I_\theta = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix},$$

[parts (a) to (c) missing in the source]

(d) a reflection is expressed as [matrix illegible in the source] $I_\theta$.


7. If $A$ is a real orthogonal matrix with $\det A = -1$, show that $A$ has
an eigenvalue $-1$.


8. Let $A$ be a $3 \times 3$ real orthogonal matrix with $\det A = 1$. Show that

$$(\mathrm{tr}\, A - 1)^2 + \sum_{i<j} (a_{ij} - a_{ji})^2 = 4.$$


9. Let $A$ be an $n \times n$ real matrix. Denote $\mathrm{ss}(A) = \sum_{i,j=1}^n a_{ij}^2$.
Show that $A$ is real orthogonal if and only if

$$\mathrm{ss}(A^TXA) = \mathrm{ss}(X), \quad\text{for all } n \times n \text{ real } X.$$


10. If $A \in M_n$ is real symmetric and idempotent, show that $0 \le a_{ii} \le 1$
for each $i$ and $|a_{ij}| \le \frac{1}{2}$ for all $i \ne j$. Moreover, if $a_{ii} = 0$ or 1, then
$a_{ij} = a_{ji} = 0$ for all $j$ with $j \ne i$.


11. Let $A$ be an $n \times n$ real orthogonal matrix such that
$\mathrm{rank}(A - I_n) = 1$. Show that $A$ is real orthogonally similar to
$\mathrm{diag}(-1, 1, \dots, 1)$.


12. Let $A = (a_{ij})$ be an $n \times n$ real matrix and $C_{ij}$ be the cofactor of
$a_{ij}$, $i, j = 1, 2, \dots, n$. Show that $A$ is orthogonal if and only if
$\det A = \pm 1$ and $a_{ij} = C_{ij}$ if $\det A = 1$, $a_{ij} = -C_{ij}$ if $\det A = -1$
for all $i$, $j$.


13. Let $A = (a_{ij}) \ne 0$ be an $n \times n$ real matrix and $C_{ij}$ be the cofactor of
$a_{ij}$, $i, j = 1, 2, \dots, n$. If $n > 2$, show that $A$ is orthogonal if
$a_{ij} = C_{ij}$ for all $i$, $j$, or $a_{ij} = -C_{ij}$ for all $i$, $j$.


14. Let
0 0 -1 1 0 1 0 0 
1 0 0 1 1 —-1 0 0 0 
a V2 1 1 0 0 |’ in 0 0 0 1 
1 -1 0 0 0 0 -1 0 


(a) Show that $UA = AU^T$.
(b) Find the eigenvalues of $U$.
(c) Show that $A^2 \ne I$.


6.3 Metric Space and Contractions


A metric space consists of a set $M$ and a mapping

$$d : M \times M \to \mathbb{R},$$

called a metric of $M$, for which

1. $d(x, y) \ge 0$, and $d(x, y) = 0$ if and only if $x = y$,
2. $d(x, y) = d(y, x)$ for all $x$ and $y$ in $M$, and
3. $d(x, y) \le d(x, z) + d(z, y)$ for all $x$, $y$, and $z$ in $M$.


Consider a sequence of points $\{x_i\}$ in a metric space $M$. If for
every $\epsilon > 0$ there exists a positive integer $N$ such that $d(x_i, x_j) < \epsilon$
for all $i, j > N$, then the sequence is called a Cauchy sequence. A
sequence $\{x_i\}$ converges to a point $x$ if for every $\epsilon > 0$ there exists
a positive integer $N$ such that $d(x, x_i) < \epsilon$ for all $i > N$. A metric
space $M$ is said to be complete if every Cauchy sequence converges
to a point of $M$. For instance, $\{c^n\}$, $0 < c < 1$, is a Cauchy sequence
of the complete metric space $\mathbb{R}$ with metric $d(x, y) = |x - y|$.

$\mathbb{C}^n$ is a metric space with metric

$$d(x, y) = \|x - y\|, \quad x, y \in \mathbb{C}^n \qquad (6.3)$$

defined by the vector norm

$$\|x\| = \Big(\sum_{i=1}^n |x_i|^2\Big)^{1/2}, \quad x \in \mathbb{C}^n.$$

Let $f : M \to M$ be a mapping of a metric space $M$ with metric
$d$ into itself. We call $f$ a contraction if there exists a constant $c$ with
$0 < c \le 1$ such that

$$d(f(x), f(y)) \le c\, d(x, y), \quad\text{for all } x, y \in M. \qquad (6.4)$$


If $0 < c < 1$, we say that $f$ is a strict contraction.
For the metric space $\mathbb{R}$ with the usual metric

$$d(x, y) = |x - y|, \quad x, y \in \mathbb{R},$$

the mapping $x \mapsto \sin\frac{x}{2}$ is a strict contraction, since by the
sum-to-product trigonometric identity (or by using the mean value theorem)

$$\sin\frac{x}{2} - \sin\frac{y}{2} = 2\cos\frac{x + y}{4}\sin\frac{x - y}{4}$$

together with the inequality $|\sin z| \le |z|$, we have for all $x$, $y$ in $\mathbb{R}$

$$\Big|\sin\frac{x}{2} - \sin\frac{y}{2}\Big| \le \frac{1}{2}|x - y|.$$

The mapping $x \mapsto \sin x$ is a contraction, but not strict, since

$$\lim_{x \to 0} \frac{\sin x}{x} = 1.$$

A point x in a metric space M is referred to as a fixed point of 
a mapping f if f(z) = x. The following fixed-point theorem of a 
contraction has applications in many fields. For example, it gives a 
useful method for constructing solutions of differential equations. 


Theorem 6.6 Let f: Mt> M be a strict contraction mapping of a 
complete metric space M into itself. Then f has one and only one 
fixed point. Moreover, for any point x © M, the sequence 


x, fle), P(x), f(a), .. 
converges to the fixed point. 
Proof. Let $x$ be a point in $M$. Denote $d(x, f(x)) = \delta$. By (6.4),

$$d(f^n(x), f^{n+1}(x)) \le c^n \delta, \quad n \ge 1. \tag{6.5}$$

The series $\sum_{n=1}^{\infty} c^n$ converges to $\frac{c}{1-c}$ for every fixed $c$, $0 < c < 1$.
Hence, the sequence $f^n(x)$, $n = 1, 2, \dots$, is a Cauchy sequence, since for $m < n$

$$\begin{aligned}
d(f^m(x), f^n(x)) &\le d(f^m(x), f^{m+1}(x)) + \cdots + d(f^{n-1}(x), f^n(x)) \\
&\le (c^m + \cdots + c^{n-1})\, \delta.
\end{aligned}$$

Thus, the limit $\lim_{n \to \infty} f^n(x)$ exists in $M$, for $M$ is complete. Let
the limit be $X$. We show that $X$ is the fixed point. Note that a
contraction mapping is a continuous function by (6.4). Therefore,

$$f(X) = f\big(\lim_{n \to \infty} f^n(x)\big) = \lim_{n \to \infty} f^{n+1}(x) = X.$$

If $Y \in M$ is also a fixed point of $f$, then

$$d(X, Y) = d(f(X), f(Y)) \le c\, d(X, Y).$$

It follows that $d(X, Y) = 0$; that is, $X = Y$ if $0 < c < 1$. ∎
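The iteration in Theorem 6.6 can be tried numerically. The following is a minimal sketch, not part of the original text: the map `half_cos` (that is, $x \mapsto \frac{1}{2}\cos x$ on $\mathbb{R}$), the helper name `iterate_to_fixed_point`, the starting points, and the tolerance are all illustrative assumptions. Since $|\frac{d}{dx}\frac{1}{2}\cos x| \le \frac{1}{2}$, the mean value theorem gives (6.4) with $c = \frac{1}{2}$.

```python
import math


def iterate_to_fixed_point(f, x0, tol=1e-12, max_iter=1000):
    """Follow x0, f(x0), f(f(x0)), ... until two successive terms differ by < tol."""
    x = x0
    for step in range(1, max_iter + 1):
        x_next = f(x)
        if abs(x_next - x) < tol:
            return x_next, step
        x = x_next
    raise RuntimeError("no convergence within max_iter steps")


def half_cos(x):
    # |d/dx (cos(x)/2)| = |sin x|/2 <= 1/2, so (6.4) holds with c = 1/2.
    return 0.5 * math.cos(x)


c = 0.5
x0 = 3.0
delta = abs(x0 - half_cos(x0))             # delta = d(x, f(x)) as in the proof
fixed, steps = iterate_to_fixed_point(half_cos, x0)
other, _ = iterate_to_fixed_point(half_cos, -10.0)

print("approximate fixed point:", fixed, "after", steps, "steps")
print("bound d(x, X) <= delta/(1-c) holds:", abs(x0 - fixed) <= delta / (1 - c))
print("same limit from another start:", abs(fixed - other) < 1e-10)
```

Both starting points lead to the same limit, as the uniqueness part of the theorem predicts, and the distance from the starting point to the limit respects the bound $\delta/(1-c)$ of Problem 2.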


Let $A$ be an $m \times n$ complex matrix, and consider $A$ as a mapping
from $\mathbb{C}^n$ into $\mathbb{C}^m$ defined by the ordinary matrix-vector product;
namely, $Ax$, where $x \in \mathbb{C}^n$. Then inequality (6.4) is rewritten as

$$\|Ax - Ay\| \le c\, \|x - y\|.$$

We show that $A$ is a contraction if and only if $\sigma_{\max}(A)$, the largest
singular value of $A$, does not exceed 1.


Theorem 6.7 A matrix $A$ is a contraction if and only if $\sigma_{\max}(A) \le 1$.


Proof. Let $A$ be $m \times n$. For any $x$, $y \in \mathbb{C}^n$, we have (Section 4.1)

$$\|Ax - Ay\| = \|A(x - y)\| \le \sigma_{\max}(A)\, \|x - y\|.$$

It follows that $A$ is a contraction if $\sigma_{\max}(A) \le 1$. Conversely, suppose
that $A$ is a contraction; then for some $c$, $0 < c \le 1$, and all $x$, $y \in \mathbb{C}^n$,

$$\|Ax - Ay\| \le c\, \|x - y\|.$$

In particular, $\|Ax\| \le c\, \|x\|$ for $x \in \mathbb{C}^n$. Thus, $\sigma_{\max}(A) \le c \le 1$. ∎
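Theorem 6.7 is easy to test on small matrices. The sketch below is an illustration under stated assumptions rather than a general routine: it handles only $2 \times 2$ matrices, for which the largest eigenvalue of the Hermitian matrix $A^*A$ has a closed form, and the function names `sigma_max_2x2` and `is_contraction` are hypothetical.

```python
import math


def sigma_max_2x2(a):
    """Largest singular value of a 2x2 real or complex matrix given as two rows."""
    (p, q), (r, s) = a
    # Entries of the Hermitian matrix G = A*A.
    g11 = abs(p) ** 2 + abs(r) ** 2
    g22 = abs(q) ** 2 + abs(s) ** 2
    g12 = p.conjugate() * q + r.conjugate() * s
    # Largest eigenvalue of a 2x2 Hermitian matrix in closed form.
    lam_max = 0.5 * (g11 + g22 + math.sqrt((g11 - g22) ** 2 + 4 * abs(g12) ** 2))
    return math.sqrt(lam_max)


def is_contraction(a, tol=1e-12):
    """Theorem 6.7: A is a contraction exactly when sigma_max(A) <= 1."""
    return sigma_max_2x2(a) <= 1 + tol


rotation = [[0, -1], [1, 0]]          # unitary: a contraction, but not strict
shrink = [[0.5, 0.5], [0.0, 0.5]]     # strict contraction
shear = [[1, 1], [0, 1]]              # both eigenvalues are 1, yet not a contraction

for name, mat in [("rotation", rotation), ("shrink", shrink), ("shear", shear)]:
    print(name, sigma_max_2x2(mat), is_contraction(mat))
```

The shear matrix shows that the eigenvalues alone do not decide the question: its eigenvalues both equal 1, but its largest singular value exceeds 1.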


Note that unitary matrices are contractions, but not strict. One
can also prove that a matrix $A$ is a contraction if and only if

$$A^*A \le I, \quad AA^* \le I, \quad \text{or} \quad \begin{pmatrix} I & A \\ A^* & I \end{pmatrix} \ge 0.$$

Here $X \ge Y$, or $Y \le X$, means that $X - Y$ is positive semidefinite.

We conclude this section by presenting a result on partitioned 
positive semidefinite matrices, from which a variety of matrix in- 
equalities can be derived. 

Let $A$ be a positive semidefinite matrix. Recall that $A^{1/2}$ is the
square root of $A$; that is, $A^{1/2} \ge 0$ and $(A^{1/2})^2 = A$ (Section 3.2).

Theorem 6.8 Let $L$ and $M$ be positive semidefinite matrices. Then

$$\begin{pmatrix} L & X \\ X^* & M \end{pmatrix} \ge 0 \iff X = L^{1/2} C M^{1/2} \ \text{for some contraction } C.$$


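Proof. Sufficiency: If $X = L^{1/2} C M^{1/2}$, then we write

$$\begin{pmatrix} L & X \\ X^* & M \end{pmatrix} = \begin{pmatrix} L^{1/2} & 0 \\ 0 & M^{1/2} \end{pmatrix} \begin{pmatrix} I & C \\ C^* & I \end{pmatrix} \begin{pmatrix} L^{1/2} & 0 \\ 0 & M^{1/2} \end{pmatrix}.$$

For the positive semidefiniteness, it suffices to note that (Problem 9)

$$\sigma_{\max}(C) \le 1 \ \Rightarrow\ \begin{pmatrix} I & C \\ C^* & I \end{pmatrix} \ge 0.$$

For the other direction, assume that $L$ and $M$ are nonsingular and
let $C = L^{-1/2} X M^{-1/2}$. Here the exponent $-1/2$ means the inverse
of the square root. Then $X = L^{1/2} C M^{1/2}$ has the desired form. We
need to show that $C$ is a contraction. First notice that

$$C^*C = M^{-1/2} X^* L^{-1} X M^{-1/2}.$$

Notice also that, since the partitioned matrix is positive semidefinite,

$$P^* \begin{pmatrix} L & X \\ X^* & M \end{pmatrix} P = \begin{pmatrix} L & 0 \\ 0 & M - X^* L^{-1} X \end{pmatrix} \ge 0,$$

where

$$P = \begin{pmatrix} I & -L^{-1}X \\ 0 & I \end{pmatrix}.$$

Thus, $M - X^* L^{-1} X \ge 0$ (Problem 9, Section 3.2). Therefore,

$$M^{-1/2}(M - X^* L^{-1} X)M^{-1/2} = I - M^{-1/2} X^* L^{-1} X M^{-1/2} \ge 0.$$

That is, $I - C^*C \ge 0$, and thus $C$ is a contraction. The singular case
of $L$ and $M$ follows from a continuity argument (Problem 17). ∎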


We end this section by noting that targeting the submatrices X 
and X* in the upper-right and lower-left corners in the given par- 
titioned matrix by row and column elementary operations for block 
matrices is a basic technique in matrix theory. It is used repeatedly 
in later chapters of this book. 
Problems

1. What are the differences between vector space, inner product space,
normed space, and metric space?

2. Following the proof of Theorem 6.6, show that

$$d(x, X) \le \frac{\delta}{1 - c}.$$

3. Show that a contraction is a continuous function.

4. If $f$ is a strict contraction of a complete metric space, show that

$$d(f^n(x), f^n(y)) \to 0, \quad \text{as } n \to \infty,$$

for any fixed $x$ and $y$ in the space.

5. Show that the product of contractions is again a contraction.

6. Is the mapping $x \mapsto \sin(2x)$ a contraction on $\mathbb{R}$? How about the
mappings $x \mapsto 2\sin x$, $x \mapsto \frac{1}{2}\sin x$, and $x \mapsto 2\sin\frac{x}{2}$?

7. Construct an example of a map $f$ for a metric space such that
$d(f(x), f(y)) < d(x, y)$ for all $x \ne y$, but $f$ has no fixed point.

8. If $A \in M_n$ is a contraction with eigenvalues $\lambda(A)$, show that

$$|\det A| \le 1, \quad |\lambda(A)| \le 1, \quad |x^*Ax| \le 1 \ \text{for unit } x \in \mathbb{C}^n.$$

9. Show that an $m \times n$ matrix $A$ is a contraction if and only if

$$x^*(A^*A)x \le 1 \ \text{for every unit } x \in \mathbb{C}^n,$$

or, equivalently,

$$\|Ax\| \le \|x\| \ \text{for every } x \in \mathbb{C}^n.$$

10. Let $A$ be an $n \times n$ matrix and $B$ be an $m \times n$ matrix. Show that

$$\begin{pmatrix} A & B^* \\ B & I \end{pmatrix} \ge 0 \iff A \ge B^*B.$$

11. Let $A \in M_n$. If $\sigma_{\max}(A) < 1$, show that $I_n - A$ is invertible.
12. Let $A$ and $B$ be $n$-square complex matrices. Show that

$$A^*A \le B^*B$$

if and only if $A = CB$ for some contraction matrix $C$.

13. Consider the complete metric space $\mathbb{R}^2$ and let

$$A = \begin{pmatrix} \lambda & 0 \\ 0 & \lambda \end{pmatrix}, \quad \lambda \in (0, 1).$$

Discuss the effect of an application of $A$ to a nonzero vector $v \in \mathbb{R}^2$.
Describe the geometric orbit of the iterates $A^n v$. What is the fixed
point of $A$? What if the second $\lambda$ in $A$ is replaced with $\mu \in (0, 1)$?

14. Let $A \in M_n$ be a projection matrix; that is, $A^2 = A$. Show that
$\|Ax\| \le \|x\|$ for all $x \in \mathbb{C}^n$ if and only if $A$ is Hermitian.

15. Let $A$ and $B$ be $n$-square positive definite matrices. Find the
conditions on the invertible $n$-square matrix $X$ so that

A xX* Aq! x 
16. Let $A$ be a matrix. If there exists a Hermitian matrix $X$ such that

$$\begin{pmatrix} I + X & A \\ A^* & I - X \end{pmatrix} \ge 0,$$

show that

$$|(Ay, y)| \le (y, y), \quad \text{for all } y.$$

17. Prove Theorem 6.8 for the singular case.

18. Show that for any matrices $X$ and $Y$ of the same size,

$$X + Y = (I + XX^*)^{1/2}\, C\, (I + Y^*Y)^{1/2}$$

for some contraction $C$. Derive the matrix inequality

$$|\det(X + Y)|^2 \le \det(I + XX^*)\, \det(I + Y^*Y).$$
6.4 Contractions and Unitary Matrices


The goal of this section is to present two theorems connecting con- 
tractions and unitary matrices. We focus on square matrices, for oth- 
erwise we can augment by zero entries to make the matrices square. 
We show that a matrix is a contraction if and only if it can be em- 
bedded in a unitary matrix, and if and only if it is a (finite) convex 
combination of unitary matrices. 


Theorem 6.9 A matrix $A$ is a contraction if and only if

$$U = \begin{pmatrix} A & X \\ Y & Z \end{pmatrix}$$

is unitary for some matrices $X$, $Y$, and $Z$ of appropriate sizes.

Proof. The sufficiency is easy to see, since if $U$ is unitary, then

$$U^*U = I \ \Rightarrow\ A^*A + Y^*Y = I \ \Rightarrow\ A^*A \le I.$$

Thus, $A$ is a contraction. For the necessity, we take

$$U = \begin{pmatrix} A & (I - AA^*)^{1/2} \\ (I - A^*A)^{1/2} & -A^* \end{pmatrix}$$

and show that $U$ is a unitary matrix as follows.

Let $A = VDW$ be a singular value decomposition of $A$, where
$V$ and $W$ are unitary, and $D$ is a nonnegative diagonal matrix with
diagonal entries (singular values of $A$) not exceeding 1. Then

$$(I - AA^*)^{1/2} = V(I - D^2)^{1/2}V^*, \quad (I - A^*A)^{1/2} = W^*(I - D^2)^{1/2}W.$$

Inasmuch as $D$ is diagonal, it is easy to see that

$$D(I - D^2)^{1/2} = (I - D^2)^{1/2}D.$$

Multiplying by $V$ and $W$ from the left and right gives

$$VD(I - D^2)^{1/2}W = V(I - D^2)^{1/2}DW$$

or equivalently

$$A(I - A^*A)^{1/2} = (I - AA^*)^{1/2}A.$$

With this, a simple computation results in $U^*U = I$. ∎
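The construction in the proof can be checked in the simplest case, where $A$ is a $1 \times 1$ matrix, that is, a complex number $a$ with $|a| \le 1$. The sketch below assumes this scalar case only; the function names `scalar_dilation` and `gram` are illustrative and not taken from the text.

```python
import math


def scalar_dilation(a, tol=1e-12):
    """Embed a complex number a with |a| <= 1 in a 2x2 unitary matrix,
    following U = [[A, (I - AA*)^(1/2)], [(I - A*A)^(1/2), -A*]]."""
    a = complex(a)
    if abs(a) > 1 + tol:
        raise ValueError("a is not a contraction: |a| > 1")
    s = math.sqrt(max(0.0, 1.0 - abs(a) ** 2))
    return [[a, s], [s, -a.conjugate()]]


def gram(u):
    """Return U*U for a square matrix U given as a list of rows."""
    n = len(u)
    return [[sum(u[k][i].conjugate() * u[k][j] for k in range(n))
             for j in range(n)] for i in range(n)]


U = scalar_dilation(0.6 + 0.3j)
G = gram(U)
print(G)  # numerically the 2x2 identity matrix
```

The upper-left entry of `U` is the original contraction, and `gram(U)` agrees with the identity up to rounding, which is the statement $U^*U = I$.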
Recall from Problem 8 of Section 4.1 that for any $A$, $B \in M_n$

$$\sigma_{\max}(A + B) \le \sigma_{\max}(A) + \sigma_{\max}(B).$$

Thus, for unitary matrices $U$ and $V$ of the same size and $t \in (0, 1)$,

$$\sigma_{\max}(tU + (1 - t)V) \le t\,\sigma_{\max}(U) + (1 - t)\,\sigma_{\max}(V) = 1.$$


In other words, the matrix tU + (1 — t)V, a convex combination of 
unitary matrices U and V, is a contraction. 

Inductively, a convex combination of unitary matrices is a con- 
traction (Problem 3). The converse is also true. 


Theorem 6.10 A matrix A is a contraction if and only if A is a 
finite convex combination of unitary matrices. 


Proof. As discussed earlier, a convex combination of unitary matri- 
ces is a contraction. Let A be a contraction. We show that A is a 
convex combination of unitary matrices. The proof goes as follows. 
$A$ is a convex combination of matrices $\operatorname{diag}(1, \dots, 1, 0, \dots, 0)$; each
matrix in such a form is a convex combination of diagonal (unitary)
matrices with diagonal entries $\pm 1$. We then reach the conclusion
that $A$ is a convex combination of unitary matrices.

Let $A$ be of rank $r$ and $A = UDV$ be a singular value decomposition
of $A$, where $U$ and $V$ are unitary, $D = \operatorname{diag}(\sigma_1, \dots, \sigma_r, 0, \dots, 0)$
with $1 \ge \sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_r > 0$. We may assume $r > 0$.

If $D$ is a convex combination of unitary matrices, say, $W_i$, then
$A$ is a convex combination of unitary matrices $UW_iV$. We may thus
consider the diagonal matrix $A = \operatorname{diag}(\sigma_1, \dots, \sigma_r, 0, \dots, 0)$. Write

$$\begin{aligned}
A &= \operatorname{diag}(\sigma_1, \dots, \sigma_r, 0, \dots, 0) \\
&= (1 - \sigma_1)\,0 + (\sigma_1 - \sigma_2)\operatorname{diag}(1, 0, \dots, 0) \\
&\quad + (\sigma_2 - \sigma_3)\operatorname{diag}(1, 1, 0, \dots, 0) + \cdots \\
&\quad + (\sigma_{r-1} - \sigma_r)\operatorname{diag}(\underbrace{1, \dots, 1}_{r-1}, 0, \dots, 0) \\
&\quad + \sigma_r \operatorname{diag}(\underbrace{1, \dots, 1}_{r}, 0, \dots, 0).
\end{aligned}$$

That is, matrix $A$ is a (finite) convex combination of matrices $E_i =
\operatorname{diag}(1, \dots, 1, 0, \dots, 0)$ with $i$ copies of 1, where $0 \le i \le r$. We now
show that such a matrix is a convex combination of diagonal matrices
with entries $\pm 1$. Let

$$F_i = \operatorname{diag}(0, \dots, 0, \underbrace{-1, \dots, -1}_{n-i}).$$

Then

$$E_i = \frac{1}{2} I + \frac{1}{2}(E_i + F_i)$$

is a convex combination of unitary matrices $I$ and $E_i + F_i$.

It follows that if $\sigma_1 \le 1$, then the matrix $\operatorname{diag}(\sigma_1, \dots, \sigma_r, 0, \dots, 0)$,
thus $A$, is a convex combination of diagonal matrices in the form
$\operatorname{diag}(1, \dots, 1, 0, \dots, 0)$, which in turn is a convex combination of
(diagonal) unitary matrices. The proof is complete by Problem 11. ∎
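The two steps of the proof (telescoping the singular values, then writing each $E_i$ as $\frac{1}{2}I + \frac{1}{2}(E_i + F_i)$) can be carried out explicitly for a diagonal contraction. The sketch below works with diagonals stored as Python lists; the function name `unitary_convex_combination` and the sample values are illustrative assumptions.

```python
def unitary_convex_combination(sigmas):
    """Write diag(sigmas), with 1 >= s_1 >= ... >= s_n >= 0, as a convex
    combination of diagonal matrices with entries +1 or -1.

    Returns a list of (coefficient, diagonal) pairs.
    """
    n = len(sigmas)
    s = list(sigmas) + [0.0]                      # sentinel s_{n+1} = 0
    if s[0] > 1 or any(s[i] < s[i + 1] for i in range(n)):
        raise ValueError("need 1 >= s_1 >= ... >= s_n >= 0")
    # Telescoping weights of E_0, E_1, ..., E_n; they sum to 1.
    weights = [1.0 - s[0]] + [s[i - 1] - s[i] for i in range(1, n + 1)]
    # Each E_i equals (1/2) I + (1/2) (E_i + F_i); collect the copies of I.
    terms = [(0.5, [1.0] * n)]
    for i, w in enumerate(weights):
        if w > 0:
            terms.append((0.5 * w, [1.0] * i + [-1.0] * (n - i)))
    return terms


sigmas = [0.9, 0.5, 0.0]
terms = unitary_convex_combination(sigmas)
total = sum(coef for coef, _ in terms)
rebuilt = [sum(coef * diag[j] for coef, diag in terms) for j in range(len(sigmas))]
print(terms)
print(total, rebuilt)
```

Every diagonal returned has entries $\pm 1$, so each term is a unitary matrix; the coefficients are nonnegative and add up to 1, and the weighted sum reproduces the original diagonal up to rounding.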


Problems 


1. Let $t \in [0, 1]$. Write $t$ as a convex combination of 1 and $-1$. Write
matrix $A$ as a convex combination of unitary matrices, where


+ 0 
3 


2. Let $\lambda$ and $\mu$ be positive numbers. Show that for any $t \in [0, 1]$,

$$\lambda\mu \le (t\lambda + (1 - t)\mu)(t\mu + (1 - t)\lambda).$$

3. Show by induction that a finite convex combination of unitary
matrices of the same size is a contraction.

4. For any two matrices $U$ and $V$ of the same size, show that

$$U^*V + V^*U \le U^*U + V^*V.$$

In particular, if $U$ and $V$ are unitary, then

$$U^*V + V^*U \le 2I.$$

Also show that for any $t \in [0, 1]$ and unitary $U$ and $V$,

$$(tU + (1 - t)V)^*(tU + (1 - t)V) \le I.$$
5. Prove or disprove that a convex combination, the sum, or the product
of two unitary matrices is a unitary matrix.

6. If $A$ is a contraction satisfying $A + A^* = 2I$, show that $A = I$.

7. Let $A$ be a complex contraction matrix. Show that

$$\begin{pmatrix} A & (I - AA^*)^{1/2} \\ (I - A^*A)^{1/2} & -A^* \end{pmatrix}$$

and

$$\begin{pmatrix} (I - AA^*)^{1/2} & A \\ -A^* & (I - A^*A)^{1/2} \end{pmatrix}$$

are unitary matrices.

8. Let $B_m$ be the $m \times m$ backward identity matrix. Show that

$$\frac{1}{\sqrt{2}} \begin{pmatrix} I_m & I_m \\ B_m & -B_m \end{pmatrix} \quad \text{and} \quad \frac{1}{\sqrt{2}} \begin{pmatrix} I_m & 0 & I_m \\ 0 & \sqrt{2} & 0 \\ B_m & 0 & -B_m \end{pmatrix}$$

are $2m$- and $(2m + 1)$-square unitary matrices, respectively.

9. Let $\sigma_1 \ge \cdots \ge \sigma_r \ge 0$. Show that $\operatorname{diag}(\sigma_1, \dots, \sigma_r, 0, \dots, 0)$ is a
convex combination of matrices $\operatorname{diag}(\sigma_1, \dots, \sigma_1, 0, \dots, 0)$ with $k$ copies
of $\sigma_1$, $k = 1, 2, \dots, r$.

10. Let $\sigma$ be such that $0 < \sigma \le 1$. Show that $\operatorname{diag}(\sigma, \dots, \sigma, 0, \dots, 0)$ is a
convex combination of the following diagonal unitary matrices:

$$I, \quad G_i = \operatorname{diag}(-1, \dots, -1, 1, \dots, 1), \quad -G_i, \quad -I.$$

[Hint: Consider the cases $0 < \sigma \le \frac{1}{2}$ and $\frac{1}{2} < \sigma \le 1$.]

11. Let $P_1, \dots, P_m$ be a set of matrices. If each $P_i$ is a convex
combination of matrices $Q_1, \dots, Q_n$, show that a convex combination of
$P_1, \dots, P_m$ is also a convex combination of $Q_1, \dots, Q_n$.
6.5 The Unitary Similarity of Real Matrices


We show in this section that if two real matrices are similar over 
the complex number field C, then they are similar over the real R. 
The statement also holds for unitary similarity. Precisely, if two real 
matrices are unitarily similar, then they are real orthogonally similar. 


Theorem 6.11 Let $A$ and $B$ be real square matrices of the same
size. If $P$ is a complex invertible matrix such that $P^{-1}AP = B$, then
there exists a real invertible matrix $Q$ such that $Q^{-1}AQ = B$.

Proof. Write $P = P_1 + P_2 i$, where $P_1$ and $P_2$ are real square
matrices. If $P_2 = 0$, we have nothing to show. Otherwise, by rewriting
$P^{-1}AP = B$ as $AP = PB$, we have $AP_1 = P_1B$ and $AP_2 = P_2B$. It
follows that for any real number $t$,

$$A(P_1 + tP_2) = (P_1 + tP_2)B.$$

Because $\det(P_1 + tP_2) = 0$ for a finite number of $t$, we can choose
a real $t$ so that the matrix $Q = P_1 + tP_2$ is invertible. Thus, $A$ and
$B$ are similar via the real invertible matrix $Q$. ∎
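The proof is constructive, and a small case shows why the parameter $t$ is needed. In the sketch below the matrices `A`, `B`, and `P` are illustrative choices (not from the text) for which both the real part $P_1$ and the imaginary part $P_2$ of $P$ are singular; the helpers `matmul`, `det2`, and `real_similarity` are hypothetical names, and `det2` handles only $2 \times 2$ matrices.

```python
def matmul(x, y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def real_similarity(p1, p2, candidates=range(10)):
    """Return (t, Q) with Q = P1 + t P2 invertible, trying t = 0, 1, 2, ..."""
    for t in candidates:
        q = [[p1[i][j] + t * p2[i][j] for j in range(2)] for i in range(2)]
        if det2(q) != 0:
            return t, q
    raise ValueError("no invertible P1 + t P2 among the candidates")


A = [[1, 1], [0, 2]]
B = [[1, 0], [0, 2]]
P = [[1j, 1], [0, 1]]                      # complex, invertible, AP = PB
P1 = [[z.real for z in row] for row in P]  # real part, singular here
P2 = [[z.imag for z in row] for row in P]  # imaginary part, singular here

t, Q = real_similarity(P1, P2)
print(t, Q)
print(matmul(A, Q) == matmul(Q, B))
```

Here $\det(P_1 + tP_2) = t$, so $t = 0$ fails and the first admissible value is $t = 1$; the resulting real matrix `Q` satisfies $AQ = QB$.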


For the unitary similarity, we begin with a result that is of interest 
in its own right. 


Theorem 6.12 Let $U$ be a symmetric unitary matrix, that is, $U^T =
U$ and $U^* = U^{-1}$. Then there exists a complex matrix $S$ satisfying
1. $S^2 = U$.
2. $S$ is unitary.
3. $S$ is symmetric.
4. $S$ commutes with every matrix that commutes with $U$.


In other words, every symmetric unitary $U$ has a symmetric unitary
square root that commutes with any matrix commuting with $U$.
Proof. Since $U$ is unitary, it is unitarily diagonalizable (Theorem 6.1).
Let $U = VDV^*$, where $V$ is unitary and $D = a_1I_1 \oplus \cdots \oplus a_kI_k$, with
all $a_j$ distinct and $I_j$ identity matrices of certain sizes. Because $U$
is unitary and hence has eigenvalues of modulus 1, we write each
$a_j = e^{i\theta_j}$ for some $\theta_j$ real.

Now let $S = V(b_1I_1 \oplus \cdots \oplus b_kI_k)V^*$, where $b_j = e^{i\theta_j/2}$. Obviously
$S$ is a unitary matrix and $S^2 = U$.

If $A$ is a matrix commuting with $U$, then $V^*AV$ commutes with
$D$. It follows that $V^*AV = A_1 \oplus \cdots \oplus A_k$, with each $A_j$ having the
same size as $I_j$ (Problem 4). Thus, $S$ commutes with $A$.

Since $U = U^T$, this implies that $V^TV$ commutes with $D$, so that
$V^TV$ commutes with $b_1I_1 \oplus \cdots \oplus b_kI_k$. Thus, $S$ is symmetric. ∎


Theorem 6.13 Let $A$ and $B$ be real square matrices of the same
size. If $A = UBU^*$ for some unitary matrix $U$, then there exists a
real orthogonal matrix $Q$ such that $A = QBQ^T$.

Proof. Since $A$ and $B$ are real, we have $UBU^* = A = \overline{A} = \overline{U}BU^T$.
This gives $U^TUB = BU^TU$. Now that $U^TU$ is symmetric unitary, by
the preceding theorem it has a symmetric unitary square root, say
$S$; that is, $U^TU = S^2$, which commutes with $B$.

Let $Q = US^{-1}$ or $U = QS$. Then $Q$ is also unitary. Notice that

$$Q^TQ = (US^{-1})^T(US^{-1}) = S^{-1}U^TUS^{-1} = I.$$

Hence $Q$ is orthogonal. $Q$ is real, for $Q^T = Q^{-1} = Q^*$ yields $Q = \overline{Q}$.
Putting it all together, $S$ and $B$ commute, $S$ is unitary, and $Q$ is real
orthogonal. We thus have

$$\begin{aligned}
A &= UBU^* = (US^{-1})(SB)U^* = Q(BS)U^* \\
&= QB(S^{-1})^*U^* = QBQ^* = QBQ^T. \qquad ∎
\end{aligned}$$


Problems 


1. If $A^2$ is a unitary matrix, is $A$ necessarily a unitary matrix?


2. If A is an invertible matrix with complex, real, rational, or integer 
entries, is the inverse of A also a matrix with complex, real, rational, 
or integer entries, respectively? 
3. Let $A$ be a normal matrix. Show that there exists a normal matrix
$B$ such that $B^2 = A$. Is such a $B$ unique?

4. If matrix $A$ commutes with $B = b_1I_1 \oplus \cdots \oplus b_kI_k$, where the $I_j$
are identity matrices and all $b_j$ are distinct, show that $A$ is of the
form $A = A_1 \oplus \cdots \oplus A_k$, where each $A_j$ has the same size as the
corresponding $I_j$.


5. Show that $A$ and $B$ are similar via a real invertible matrix $Q$, where


a-(43): 2-(5 8): 


Find Q. Are they real orthogonally (or unitarily) similar? 


6. If two matrices $A$ and $B$ with rational entries are similar over the
complex $\mathbb{C}$, are they similar over the real $\mathbb{R}$? The rational $\mathbb{Q}$?

7. Let $Q$ be a real orthogonal matrix. If $\lambda$ is an imaginary eigenvalue of
$Q$ and $u = x + yi$ is a corresponding eigenvector, where $x$ and $y$ are
real, show that $x$ and $y$ are orthogonal and have the same length.

8. Let $b$ and $c$ be complex numbers such that $|b| \ne |c|$. Show that for
any complex numbers $a$ and $d$, matrix $A$ cannot be normal, where

$$A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}.$$


9. Show that if $A$ is a real symmetric matrix, then there exists a real
orthogonal $Q$ such that $Q^TAQ$ is real diagonal. Give an example of
a real normal matrix that is unitarily similar to a (complex) diagonal
matrix but is not real orthogonally similar to a diagonal matrix.

10. Let
1 0 O 
, B=| 0 2 4+v2 
0 0 2 


1 0 
A=]| 0 2 
0 O 


NR 


(a) What are the eigenvalues and eigenvectors of A and B? 
(b) Why are A and B similar? 

(c) Show that 1 is a singular value of $B$ but not of $A$.
(d) Show that $A$ cannot be unitarily similar to a direct sum of
upper-triangular matrices of order 1 or 2.


(e) Can A and B be unitarily similar? 
6.6 A Trace Inequality of Unitary Matrices


The set of all complex matrices of the same size forms an inner
product space over $\mathbb{C}$ with respect to the inner product defined as

$$(A, B)_M = \operatorname{tr}(B^*A).$$


In what follows we consider the inner product space $M_n$ over $\mathbb{C}$
and present a trace inequality for complex unitary matrices, relating 
the average of the eigenvalues of each of two unitary matrices to that 
of their product. For this purpose, we first show an inequality for an 
inner product space V, which is of interest in its own right. 


Theorem 6.14 Let $u$, $v$, and $w$ be unit vectors in $V$ over $\mathbb{C}$. Then

$$\sqrt{1 - |(u, v)|^2} \le \sqrt{1 - |(u, w)|^2} + \sqrt{1 - |(w, v)|^2}. \tag{6.6}$$

Equality holds if and only if $w$ is a multiple of $u$ or $v$.


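Proof. To prove this, we first notice that any component of $w$ that is
orthogonal to the span of $u$ and $v$ plays no role in (6.6); namely, we
really have a problem in which $u$ and $v$ are arbitrary unit vectors, $w$
is in the span of $u$ and $v$, and $(w, w) \le 1$. The case $w = 0$ is trivial. If
$w \ne 0$, scaling up $w$ to have length 1 diminishes the right-hand side
of (6.6), so we are done if we can prove inequality (6.6) for arbitrary
unit vectors $u$, $v$, and $w$ with $w$ in the span of $u$ and $v$. The case in
which $u$ and $v$ are dependent is trivial. Suppose $u$ and $v$ are linearly
independent, and let $\{u, z\}$ be an orthonormal basis of $\operatorname{Span}\{u, v\}$,
so that $v = \mu u + \lambda z$ and $w = \alpha u + \beta z$ for some complex numbers $\mu$,
$\lambda$, $\alpha$, and $\beta$. Then we have

$$|\lambda|^2 + |\mu|^2 = 1 \quad \text{and} \quad |\alpha|^2 + |\beta|^2 = 1.$$

Use these relations and the arithmetic-geometric mean inequality,
together with $|c| \ge \operatorname{Re}(c)$ for any complex number $c$, to compute

$$\begin{aligned}
|\lambda\beta| &= \tfrac{1}{2}|\lambda\beta|\,(|\mu|^2 + |\lambda|^2 + |\alpha|^2 + |\beta|^2) \\
&\ge |\lambda\beta|\,(|\lambda\beta| + |\alpha\mu|) \\
&= |\lambda\beta|^2 + |\lambda\beta\alpha\mu| \\
&= |\lambda\beta|^2 + |\lambda\bar{\beta}\alpha\bar{\mu}| \\
&\ge |\lambda\beta|^2 + \operatorname{Re}(\lambda\bar{\beta}\alpha\bar{\mu}),
\end{aligned}$$

so that $-2|\lambda\beta| \le -2|\lambda\beta|^2 - 2\operatorname{Re}(\lambda\bar{\beta}\alpha\bar{\mu})$. Thus, we have

$$\begin{aligned}
(|\lambda| - |\beta|)^2 &= |\lambda|^2 - 2|\lambda\beta| + |\beta|^2 \\
&\le |\lambda|^2 + |\beta|^2 - 2|\lambda\beta|^2 - 2\operatorname{Re}(\lambda\bar{\beta}\alpha\bar{\mu}) \\
&= |\lambda|^2 + |\beta|^2(1 - |\lambda|^2) - |\lambda\beta|^2 - 2\operatorname{Re}(\lambda\bar{\beta}\alpha\bar{\mu}) \\
&= (1 - |\mu|^2) + |\beta|^2|\mu|^2 - |\lambda\beta|^2 - 2\operatorname{Re}(\lambda\bar{\beta}\alpha\bar{\mu}) \\
&= 1 - |\mu|^2(1 - |\beta|^2) - |\lambda\beta|^2 - 2\operatorname{Re}(\lambda\bar{\beta}\alpha\bar{\mu}) \\
&= 1 - |\mu\alpha|^2 - |\lambda\beta|^2 - 2\operatorname{Re}(\lambda\bar{\beta}\alpha\bar{\mu}) \\
&= 1 - |\alpha\bar{\mu} + \beta\bar{\lambda}|^2.
\end{aligned}$$

This gives

$$|\lambda| - |\beta| \le \sqrt{1 - |\alpha\bar{\mu} + \beta\bar{\lambda}|^2},$$

or

$$|\lambda| \le |\beta| + \sqrt{1 - |\alpha\bar{\mu} + \beta\bar{\lambda}|^2},$$

which is the same as

$$\sqrt{1 - |\mu|^2} \le \sqrt{1 - |\alpha|^2} + \sqrt{1 - |\alpha\bar{\mu} + \beta\bar{\lambda}|^2}.$$

Because $|\mu|^2 = |(u, v)|^2$, $|\alpha|^2 = |(u, w)|^2$, and

$$|\alpha\bar{\mu} + \beta\bar{\lambda}|^2 = |(\alpha u + \beta z,\ \mu u + \lambda z)|^2 = |(w, v)|^2,$$

the inequality (6.6) is proved.

Equality holds for the overall inequality if and only if equality
holds at the two points in our derivation where we invoked the
arithmetic-geometric mean inequality and $|c| \ge \operatorname{Re}(c)$. Thus, equality
holds if and only if $|\lambda| = |\beta|$ and $|\alpha| = |\mu|$, as well as $\operatorname{Re}(\lambda\bar{\beta}\alpha\bar{\mu}) =
|\lambda\bar{\beta}\alpha\bar{\mu}|$. The former is equivalent to having $\lambda = e^{i\theta}\beta$ and $\mu = e^{i\phi}\alpha$
for some real numbers $\theta$ and $\phi$, while the latter is then equivalent
to $\operatorname{Re}(|\alpha\beta|^2(e^{i(\theta - \phi)} - 1)) = 0$. Thus, $\alpha = 0$, $\beta = 0$, or $e^{i\theta} = e^{i\phi}$, so
equality in (6.6) holds if and only if either $w$ is a multiple of $u$ ($\beta = 0$)
or $w$ is a multiple of $v$ ($\alpha = 0$ or $e^{i\theta} = e^{i\phi}$). ∎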

Now consider the vector space $M_n$ of all $n \times n$ complex matrices
with the inner product $(A, B)_M = \operatorname{tr}(B^*A)$ for $A$ and $B$ in $M_n$.

Let $U$ and $V$ be $n$-square unitary matrices. By putting

$$u = \frac{1}{\sqrt{n}}\, V, \quad v = \frac{1}{\sqrt{n}}\, U^*, \quad w = \frac{1}{\sqrt{n}}\, I$$

in (6.6), and writing $m(X) = \frac{1}{n}\operatorname{tr} X$ for the average of the eigenvalues
of the matrix $X \in M_n$, we have the following result.


Theorem 6.15 For any unitary matrices $U$ and $V$,

$$\sqrt{1 - |m(UV)|^2} \le \sqrt{1 - |m(U)|^2} + \sqrt{1 - |m(V)|^2}$$

with equality if and only if $U$ or $V$ is a unitary scalar matrix.
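As a numerical illustration of Theorem 6.15, the sketch below evaluates both sides of the inequality for two $2 \times 2$ unitary matrices and then for a scalar unitary matrix, where equality is expected up to rounding. The sample matrices and the helper names `mean_eig`, `defect`, and `slack` are illustrative assumptions.

```python
import cmath
import math


def matmul(x, y):
    """Product of two square matrices stored as lists of rows."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mean_eig(x):
    """m(X) = tr(X)/n, the average of the eigenvalues of X."""
    return sum(x[i][i] for i in range(len(x))) / len(x)


def defect(x):
    """sqrt(1 - |m(X)|^2), clipped at 0 to absorb rounding error."""
    return math.sqrt(max(0.0, 1.0 - abs(mean_eig(x)) ** 2))


def slack(u, v):
    """Right-hand side minus left-hand side of the inequality in Theorem 6.15."""
    return defect(u) + defect(v) - defect(matmul(u, v))


theta = math.pi / 3
U = [[1, 0], [0, 1j]]                               # diagonal unitary
V = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta), math.cos(theta)]]            # rotation
z = cmath.exp(0.7j)
scalar = [[z, 0], [0, z]]                           # unitary scalar matrix

print(slack(U, V))        # nonnegative
print(slack(scalar, V))   # zero up to rounding
```

For the scalar matrix the term $\sqrt{1 - |m(U)|^2}$ vanishes and the two remaining terms coincide, which is the equality case of the theorem.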


Problems 


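1. Let $U$ be an $m \times n$ matrix such that $U^*U = I_n$. Show that

$$\operatorname{tr}(UAU^*) = \operatorname{tr} A, \quad \text{for any } A \in M_n.$$

How about

$$\operatorname{tr}(U^*AU) = \operatorname{tr} A, \quad \text{for any } A \in M_m?$$

2. Show that for any square matrix $A$ and positive integers $p$ and $q$

$$|\operatorname{tr} A^{p+q}|^2 \le \operatorname{tr}\big((A^*)^pA^p\big)\, \operatorname{tr}\big((A^*)^qA^q\big)$$

and

$$\operatorname{tr}\big((A^*)^pA^p\big) \le (\operatorname{tr}(A^*A))^p, \quad \operatorname{tr}(A^*A)^p \le (\operatorname{tr}(A^*A))^p.$$

3. If $U$ is an $n \times n$ nonscalar unitary matrix with eigenvalues $\lambda_1, \dots, \lambda_n$,
show that the following strict inequality holds:

$$\Big|\frac{\lambda_1 + \cdots + \lambda_n}{n}\Big| < 1.$$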
4. Let $V$ be any square submatrix of a unitary matrix $U$. Show that
$|\lambda(V)| \le 1$ for any eigenvalue $\lambda(V)$ of $V$.

5. For unit vectors $u$ and $v$ in an inner product space $V$ over $\mathbb{C}$, define
$\angle_{u,v} = \cos^{-1}|(u, v)|$. Show that for any unit vector $w$ in $V$,

$$\sin\angle_{u,v} \le \sin\angle_{u,w} + \sin\angle_{w,v}.$$

6. Let $D = \operatorname{diag}(a_1, \dots, a_n)$. Show that for any $n$-square unitary $U$,

$$\min_i |a_i| \le |\lambda(DU)| \le \max_i |a_i|,$$

where $\lambda(DU)$ is any eigenvalue of $DU$.


7. With $\|A\|_M = (A, A)_M^{1/2} = \sqrt{\operatorname{tr}(A^*A)}$, show that for any $A \in M_n$

$$\|A\|_M^2 = \Big\|\frac{A + A^*}{2}\Big\|_M^2 + \Big\|\frac{A - A^*}{2}\Big\|_M^2.$$


8. Let $U$ be a unitary matrix. If $\lambda$ and $\mu$ are two different eigenvalues
of $U$, show that their eigenvectors $u$ and $v$ are orthogonal. Further
show that $au + bv$ cannot be an eigenvector of $U$ if $ab \ne 0$.


. Let P? be the collection of all the unit vectors in C”. Define 


d(x,y) =/1—|(x,y)|?,. 2, ye P?. 


Show that d(x, y) = d(y, x) for all x and y in P? and that d(x, y) =0 
if and only if 2 = cy for some complex number c with |c| = 1. 


For nonzero vectors u, v € C?, define 


(u,v)? 
lull? [le|l? 


d(u,v) =4/1— 
Show that for any u, v, w € C?, and A, pw € C, 
(a) d(du, yo) = a(u, »), 
(b) d(u,v) < d(u, w) + d(w, v), 
(c) d(u, v) = d(2u, 2v), where z, = 2 ifx = (a1, 22)" € C?, 21 £0, 


\Zu — 2v| 


V+ leu!) + [20?) 


Bl iy, Zu) nd 
CHAPTER 7


Positive Semidefinite Matrices 


Introduction: This chapter studies the positive semidefinite matrices,
concentrating primarily on the inequalities of this type of matrix.
The main goal is to present the fundamental results and show some
often-used techniques. Section 7.1 gives the basic properties, Section
7.2 treats the Löwner partial ordering of positive semidefinite matrices,
and Section 7.3 presents some inequalities of principal submatrices.
Section 7.4 derives inequalities of partitioned positive semidefinite
matrices using Schur complements, and Sections 7.5 and 7.6
investigate the Hadamard product of the positive semidefinite matrices.
Finally, Section 7.7 shows the Cauchy-Schwarz type matrix
inequalities and the Wielandt and Kantorovich inequalities.


7.1 Positive Semidefinite Matrices 


An $n$-square complex matrix $A$ is said to be positive semidefinite or
nonnegative definite, written as $A \ge 0$, if

$$x^*Ax \ge 0, \quad \text{for all } x \in \mathbb{C}^n. \tag{7.1}$$

$A$ is further called positive definite, symbolized $A > 0$, if the strict
inequality in (7.1) holds for all nonzero $x \in \mathbb{C}^n$.
It is immediate that if $A$ is an $n \times n$ complex matrix, then

$$A \ge 0 \ \Rightarrow\ X^*AX \ge 0 \tag{7.2}$$

for every $n \times m$ complex matrix $X$. (Note that one may augment a
vector $x \in \mathbb{C}^n$ by zero entries to get a matrix of size $n \times m$.)

The following decomposition theorem (see the spectral decompo- 
sition theorem in Chapter 3) of positive semidefinite matrices best 
characterizes positive semidefiniteness under unitary similarity. 


Theorem 7.1 An $n \times n$ complex matrix $A$ is positive semidefinite
if and only if there exists an $n \times n$ unitary matrix $U$ such that

$$A = U^* \operatorname{diag}(\lambda_1, \dots, \lambda_n)\, U, \tag{7.3}$$

where the $\lambda_i$ are the eigenvalues of $A$ and are all nonnegative. In
addition, if $A \ge 0$ then $\det A \ge 0$. $A$ is positive definite if and only
if all the $\lambda_i$ in (7.3) are positive. Besides, if $A > 0$ then $\det A > 0$.


Positive semidefinite matrices have many interesting and impor- 
tant properties and play a central role in matrix theory. 


Theorem 7.2 Let A be an n-square Hermitian matrix. Then 


1. A is positive definite if and only if the determinant of every 
leading principal submatrix (leading minor) of A is positive. 

2. A is positive semidefinite if and only if the determinant of every 
(not just leading) principal submatrix of A is nonnegative. 


Proof. Let $A_k$ be a $k \times k$ principal submatrix of $A \in M_n$. By
permuting rows and columns we may place $A_k$ in the upper-left corner
of $A$. In other words, there exists a permutation matrix $P$ such that
$A_k$ is the $(1, 1)$-block of $P^TAP$.

If $A \ge 0$, then (7.1) holds. Thus, for any $x \in \mathbb{C}^k$,

$$x^*A_kx = y^*Ay \ge 0, \quad \text{where } y = P\begin{pmatrix} x \\ 0 \end{pmatrix} \in \mathbb{C}^n.$$

This says that $A_k$ is positive semidefinite. Therefore, $\det A_k \ge 0$.
The strict inequalities hold for positive definite matrix $A$.

Conversely, if every principal submatrix of $A$ has a nonnegative
determinant, then the polynomial in $\lambda$ (Problem 19 of Section 1.3)

$$\det(\lambda I - A) = \lambda^n - \delta_1\lambda^{n-1} + \delta_2\lambda^{n-2} - \cdots + (-1)^n \det A$$

has no negative zeros, since each $\delta_i$, the sum of the determinants of
all the principal matrices of order $i$, is nonnegative.
The case where $A$ is positive definite follows similarly. ∎
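The two criteria of Theorem 7.2 translate directly into a test on minors. The following sketch assumes real symmetric matrices with exact (integer or rational) entries and uses a cofactor expansion, which is practical only for small orders; the function names are illustrative.

```python
from itertools import combinations


def det(m):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total


def principal_minor(a, idx):
    """Determinant of the principal submatrix of a on the index set idx."""
    return det([[a[i][j] for j in idx] for i in idx])


def is_positive_definite(a):
    """Part 1: every leading principal minor is positive."""
    n = len(a)
    return all(principal_minor(a, tuple(range(k))) > 0 for k in range(1, n + 1))


def is_positive_semidefinite(a):
    """Part 2: every principal minor (not just the leading ones) is nonnegative."""
    n = len(a)
    return all(principal_minor(a, idx) >= 0
               for k in range(1, n + 1)
               for idx in combinations(range(n), k))


tridiag = [[2, -1, 0], [-1, 2, -1], [0, -1, 2]]   # leading minors 2, 3, 4
ones = [[1, 1], [1, 1]]                            # semidefinite, not definite
tricky = [[0, 0], [0, -1]]                         # leading minors 0, 0

print(is_positive_definite(tridiag), is_positive_semidefinite(ones))
print(is_positive_semidefinite(tricky))
```

The matrix `tricky` has nonnegative leading minors but a negative diagonal entry, which is why part 2 of the theorem requires all principal minors rather than only the leading ones.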


Note that A being Hermitian in Theorem 7.2 is necessary. For 
instance, all the determinants of the principal submatrices of the 
matrix A = é 4 are positive, but A is not positive (semi)definite. 

As a side product of the proof, we see that $A$ is positive (semi)definite
if and only if all of its principal submatrices are positive (semi)definite.

It is immediate that $A \ge 0 \Rightarrow a_{ii} \ge 0$ and that $a_{ii}a_{jj} \ge |a_{ij}|^2$
for $i \ne j$ by considering $2 \times 2$ principal submatrices

$$\begin{pmatrix} a_{ii} & a_{ij} \\ a_{ji} & a_{jj} \end{pmatrix} \ge 0.$$

Thus, if some diagonal entry $a_{ii} = 0$, then $a_{ij} = 0$ for all $j$, and hence,
$a_{hi} = 0$ for all $h$, inasmuch as $A$ is Hermitian. We conclude that
some diagonal entry $a_{ii} = 0$ if and only if the row and the column
containing $a_{ii}$ consist entirely of 0.

Using Theorem 7.1 and the fact that any square matrix is a product
of a unitary matrix and an upper-triangular matrix (QR factorization;
see Section 3.2), one can prove the next result (Problem 17).


Theorem 7.3 The following statements for $A \in M_n$ are equivalent.


1. $A$ is positive semidefinite.

2. $A = B^*B$ for some matrix $B$.

3. $A = C^*C$ for some upper-triangular matrix $C$.

4. $A = D^*D$ for some upper-triangular matrix $D$ with nonnegative
diagonal entries (Cholesky factorization).

5. $A = E^* \begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix} E = F^*F$ for some $n \times n$ invertible matrix $E$ and
$r \times n$ matrix $F$, where $r$ is the rank of $A$ (Rank factorization).
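Statement 4 can be made concrete for a real symmetric positive definite matrix. The sketch below is the standard row-by-row Cholesky recursion written for that special case (it raises an error on a nonpositive pivot rather than handling the semidefinite case); the function name `cholesky_upper` is an illustrative choice.

```python
import math


def cholesky_upper(a):
    """Return upper-triangular R with positive diagonal such that R^T R = a,
    for a real symmetric positive definite matrix a (list of rows)."""
    n = len(a)
    r = [[0.0] * n for _ in range(n)]
    for i in range(n):
        pivot = a[i][i] - sum(r[k][i] ** 2 for k in range(i))
        if pivot <= 0:
            raise ValueError("matrix is not positive definite")
        r[i][i] = math.sqrt(pivot)
        for j in range(i + 1, n):
            partial = sum(r[k][i] * r[k][j] for k in range(i))
            r[i][j] = (a[i][j] - partial) / r[i][i]
    return r


def gram_real(r):
    """Return R^T R for a real square matrix R."""
    n = len(r)
    return [[sum(r[k][i] * r[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


A = [[4.0, 2.0], [2.0, 3.0]]
R = cholesky_upper(A)
print(R)             # upper triangular with positive diagonal
print(gram_real(R))  # reproduces A up to rounding
```

A failed pivot doubles as a definiteness test: the recursion runs to completion exactly when the matrix is positive definite.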
Every nonnegative number has a unique nonnegative square root.
The analogous result for positive semidefinite matrices also holds as
we saw in Section 3.2 (Theorem 3.5). We restate the theorem and
present a proof that works for linear operators.


Theorem 7.4 For every $A \ge 0$, there exists a unique $B \ge 0$ so that

$$B^2 = A.$$

Furthermore, $B$ can be expressed as a polynomial in $A$.


Proof. We may view $n$-square matrices as linear operators on $\mathbb{C}^n$.
The spectral theorem ensures the existence of orthonormal eigenvectors
$u_1, u_2, \dots, u_n$ belonging to the eigenvalues $\lambda_1, \lambda_2, \dots, \lambda_n$ of $A$,
respectively. Then $u_1, u_2, \dots, u_n$ form an orthonormal basis for $\mathbb{C}^n$ and
$A(u_i) = \lambda_iu_i$, $\lambda_i \ge 0$. Define a linear operator $B$ by $B(u_i) = \sqrt{\lambda_i}\, u_i$
for $i = 1, 2, \dots, n$. It is routine to check that $B^2(x) = A(x)$ and
$(B(x), x) \ge 0$ for all vectors $x$; that is, $B^2 = A$ and $B \ge 0$.

To show the uniqueness, suppose $C$ is also a linear operator such
that $C^2(x) = A(x)$ and $(C(x), x) = (x, C(x)) \ge 0$ for all vectors $x$.
If $v$ is an eigenvector of $C$: $Cv = \mu v$, then $C^2v = \mu^2v$, i.e., $\mu^2$ is
an eigenvalue of $A$. Hence, the eigenvalues of $C$ are the nonnegative
square roots of the eigenvalues of $A$; that is, $\sqrt{\lambda_1}, \sqrt{\lambda_2}, \dots, \sqrt{\lambda_n}$.

Choose orthonormal eigenvectors $v_1, v_2, \dots, v_n$ corresponding to
the eigenvalues $\sqrt{\lambda_1}, \sqrt{\lambda_2}, \dots, \sqrt{\lambda_n}$ of $C$, respectively. Then $v_1, v_2,
\dots, v_n$ form an orthonormal basis for $\mathbb{C}^n$. Let $u_i = w_{1i}v_1 + \cdots + w_{ni}v_n$,
$i = 1, 2, \dots, n$. On one hand, $C^2(u_i) = A(u_i) = \lambda_iu_i = w_{1i}\lambda_iv_1 +
\cdots + w_{ni}\lambda_iv_n$; however, $C^2(u_i) = w_{1i}\lambda_1v_1 + \cdots + w_{ni}\lambda_nv_n$. Because
$v_1, v_2, \dots, v_n$ are linearly independent, we have $w_{ti}\lambda_i = w_{ti}\lambda_t$ for each
$t$. It follows that $w_{ti}\sqrt{\lambda_i} = w_{ti}\sqrt{\lambda_t}$, $t = 1, 2, \dots, n$. Thus,

$$\begin{aligned}
C(u_i) &= C(w_{1i}v_1 + \cdots + w_{ni}v_n) \\
&= w_{1i}\sqrt{\lambda_1}\, v_1 + \cdots + w_{ni}\sqrt{\lambda_n}\, v_n \\
&= w_{1i}\sqrt{\lambda_i}\, v_1 + \cdots + w_{ni}\sqrt{\lambda_i}\, v_n \\
&= \sqrt{\lambda_i}\, u_i = B(u_i).
\end{aligned}$$

As $u_1, u_2, \dots, u_n$ constitute a basis for $\mathbb{C}^n$, we conclude $B = C$.
To see that $B$ is a polynomial of $A$, let $p(x)$ be a polynomial,
by interpolation, such that $p(\lambda_i) = \sqrt{\lambda_i}$, $i = 1, 2, \dots, n$ (Problem 4,
Section 5.4). Then it is easy to verify that $p(A) = B$. ∎

Such a matrix $B$ is called the square root of $A$, denoted by $A^{1/2}$.

Note that $A^*A$ is positive semidefinite for every complex matrix
$A$ and that the eigenvalues of $(A^*A)^{1/2}$ are the singular values of $A$.
We further discuss the matrix $(A^*A)^{1/2}$ in Chapters 8 and 9.
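For a $2 \times 2$ positive semidefinite matrix the square root can be written down explicitly, which also illustrates the claim that $B$ is a polynomial in $A$. The sketch below is restricted to the real symmetric $2 \times 2$ case and relies on the Cayley-Hamilton identity $B^2 - (\operatorname{tr} B)B + (\det B)I = 0$, which gives $B = (A + \sqrt{\det A}\, I)/\tau$ with $\tau = \sqrt{\operatorname{tr} A + 2\sqrt{\det A}}$; the function name `sqrt_psd_2x2` is illustrative.

```python
import math


def sqrt_psd_2x2(a):
    """Positive semidefinite square root of a real symmetric PSD 2x2 matrix.

    Uses B = (A + sqrt(det A) I) / tau, where tau = sqrt(tr A + 2 sqrt(det A))
    is the trace of B; B is therefore a degree-one polynomial in A.
    """
    (p, q), (_, s) = a
    root_det = math.sqrt(max(0.0, p * s - q * q))   # det B = sqrt(det A)
    tau = math.sqrt(p + s + 2.0 * root_det)         # tr B
    if tau == 0.0:                                  # only happens for A = 0
        return [[0.0, 0.0], [0.0, 0.0]]
    return [[(p + root_det) / tau, q / tau],
            [q / tau, (s + root_det) / tau]]


def square(b):
    """Return B times B for a 2x2 matrix B."""
    return [[sum(b[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


A = [[2.0, 1.0], [1.0, 2.0]]     # eigenvalues 3 and 1
B = sqrt_psd_2x2(A)
print(B)
print(square(B))                 # reproduces A up to rounding
```

For this `A` the eigenvalues of `B` are $\sqrt{3}$ and 1, so its diagonal entries equal $(\sqrt{3} + 1)/2$ and its off-diagonal entries equal $(\sqrt{3} - 1)/2$.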


Problems 


1. Show that if $A$ is a positive semidefinite matrix, then so are the
matrices $\overline{A}$, $A^T$, $\operatorname{adj}(A)$, and $A^{-1}$ if the inverse exists.


2. Let $A$ be a positive semidefinite matrix. Show that $\operatorname{tr} A \ge 0$. Equality
holds if and only if $A = 0$.

3. Let $A = (a_{ij}) \in M_n$ and $S = \sum_{i,j=1}^{n} a_{ij}$. If $A \ge 0$, show that $S \ge 0$.

4. Let $A \in M_n$ be positive semidefinite. Show that $(\det A)^{1/n} \le \frac{1}{n}\operatorname{tr} A$.


5. Find a $2 \times 2$ nonsymmetric real matrix $A$ such that $x^TAx \ge 0$ for
every $x \in \mathbb{R}^2$. What if $x \in \mathbb{C}^2$?


6. Let 


4 0 0 
A={000], B 
0 0 0 


II 
eRe 
eee 
ao 


(a) Is B similar to A? 
(b) Is B congruent to A? 


(c) Is $B$ obtainable from $A$ by elementary operations?


7. For what real number $t$ is the following $n \times n$ matrix with diagonal
entries 1 and off-diagonal entries $t$ positive semidefinite?


1 t 
t 1 
8. For what x, y, z € C are the following matrices positive semidefinite? 


1 


Sek 
ao 
ere & 
ran 
nd al ot 
< 
oN 
rer ’& 
a) 
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. Let x €C,u,v € C”, a € (0, 1]. Show that (if the powers make sense) 


Feelleias x uu u*v 
( z = |a|2-) 20, vu vty | 2% 
If A, u € C, show that the following matrices are positive semidefinite: 
JAP? +1 At+p JAI? Aw JAI? + |u|? Ate 
A+ mp |wP+1 )? \ Am |wP 7’ r+ fi 2 J 


11. Show that if $A$ is positive definite, so is a principal submatrix of $A$.
Conclude that the diagonal entries of $A$ are all positive.

12. Let $A = (a_{ij}) > 0$ be $n \times n$. Show that the matrix with $(i, j)$-entry

$$\frac{a_{ij}}{\sqrt{a_{ii}a_{jj}}}$$

is positive definite. What are the diagonal entries of this matrix?

13. Let $A \ge 0$. Show that $A$ can be written as a sum of rank 1 matrices

$$A = \sum_{i=1}^{k} u_iu_i^*,$$

where each $u_i$ is a column vector and $k = \operatorname{rank}(A)$.

14. Let $A \in M_n$. If $x^*Ax = 0$ for some $x \ne 0$, does it follow that $A = 0$?
or $Ax = 0$? What if $A$ is positive semidefinite?

15. Let $A$ be an $n$-square positive semidefinite matrix. Show that

$$\lambda_{\min}(A) \le x^*Ax \le \lambda_{\max}(A), \quad \text{for any unit } x \in \mathbb{C}^n.$$

16. Show that $A \ge 0$ if and only if $A = Q^*Q$ for some matrix $Q$ with
linearly independent rows. What is the size of $Q$?

17. Prove Theorem 7.3. Show further that if $A > 0$ then the Cholesky
factorization of $A$ is unique.

18. Show that every positive definite matrix is *-congruent to its inverse; that
is, if $A > 0$ then $A^{-1} = P^*AP$ for some invertible matrix $P$.

19. Let $A$ be a Hermitian matrix. Show that all the eigenvalues of $A$ lie
in the interval $[a, b]$ if and only if $A - aI \ge 0$ and $bI - A \ge 0$ and that
there exist $\alpha > 0$ and $\beta > 0$ such that $\alpha I + A > 0$ and $I + \beta A > 0$.
20. Let $A$ be a Hermitian matrix. If no eigenvalue of $A$ lies in the interval
$[a, b]$, show that $A^2 - (a + b)A + abI$ is positive definite.

21. Find a matrix $A \in M_n$ such that all of its principal submatrices of
order not exceeding $n - 1$ are positive semidefinite, but $A$ is not.

22. Find a Hermitian matrix $A$ such that the leading minors are all
nonnegative, but $A$ is not positive semidefinite.

23. Let $A \in M_n$. Show that $A \ge 0$ if and only if every leading principal
submatrix of $A$ (including $A$ itself) is positive semidefinite.

24. Let $A \in M_n$ be a singular Hermitian matrix. If $A$ contains a positive
definite principal submatrix of order $n - 1$, show that $A \ge 0$.

25. Let $A \in M_n$ be a positive definite matrix. Prove $\operatorname{tr} A \operatorname{tr} A^{-1} \ge n^2$.

26. Does every normal matrix have a normal square root? Is it unique?
How about a general complex matrix?

27. Find the square roots for the positive semidefinite matrices


i a 1 al 
2 
(ii) Gi). Git) ea 


28. Let $A$ be a Hermitian matrix and $k > 0$ be an odd number. Show
that there exists a unique Hermitian matrix $B$ such that $A = B^k$.
Show further that if $AP = PA$ for some matrix $P$, then $BP = PB$.

29. Let $A$ be a nonzero $n$-square matrix. If $A$ is Hermitian and satisfies

$$\frac{\operatorname{tr} A}{(\operatorname{tr} A^2)^{1/2}} \ge \sqrt{n - 1},$$

show that $A \ge 0$. Conversely, if $A \ge 0$, show that

$$\frac{\operatorname{tr} A}{(\operatorname{tr} A^2)^{1/2}} \ge 1.$$

30. Let $A$ be a square complex matrix such that $A + A^* \ge 0$. Show that

$$\det \frac{A + A^*}{2} \le |\det A|.$$

31. Show that if $B$ commutes with $A \ge 0$, then $B$ commutes with $A^{1/2}$.
Thus any positive semidefinite matrix commutes with its square root.

32. Let $A > 0$. Show that $(A^{-1})^{1/2} = (A^{1/2})^{-1}$ (denoted by $A^{-1/2}$).
33. Does a contractive matrix really make matrices "smaller"? To be
precise, if $A$ is a positive definite matrix and $C$ is a contraction, i.e.,
$\sigma_{\max}(C) \le 1$, both of size $n \times n$, is it true that $A \ge C^*AC$?

34. Let $A \ge 0$ and let $B$ be a principal submatrix of $A$. Show that $B$ is
singular (i.e., $\det B = 0$) if and only if the rows (columns) of $A$ that
contain $B$ are linearly dependent.

35. Let $A$ be an $n \times n$ positive definite matrix. Show that

$$(\det A)^{1/n} = \min \frac{\operatorname{tr}(AX)}{n},$$

where the minimum is taken over all $n$-square $X > 0$ with $\det X = 1$.

36. Let $A$ be a Hermitian matrix. Show that neither $A$ nor $-A$ is positive
semidefinite if and only if at least one of the following holds.

(a) $A$ has a minor of even order with negative sign.

(b) $A$ has two minors of odd order with opposite signs.

37. Let $A$ and $B$ be $n \times n$ complex matrices. Show that
$$A^*A = B^*B$$
if and only if $B = UA$ for some unitary $U$. When does

$$\frac{A^* + A}{2} = (A^*A)^{1/2}?$$

38. Let $A$ be a positive definite matrix and $r \in [0, 1]$. Show that
$$rA + (1 - r)I \ge A^r$$
and
$$(Au, u)^r \ge (A^r u, u), \quad \text{for all unit vectors } u.$$

39. Let $A$ and $C$ be $n \times n$ matrices, where $A$ is positive semidefinite and
$C$ is contractive. Show that for any real number $r$, $0 < r < 1$,

$$C^*A^rC \le (C^*AC)^r.$$
7.2 A Pair of Positive Semidefinite Matrices


Inequality is one of the main topics in modern matrix theory. In this
section we present some inequalities involving two positive semidefinite
matrices.

Let $A$ and $B$ be two Hermitian matrices of the same size. If $A - B$
is positive semidefinite, we write

$$A \ge B \quad \text{or} \quad B \le A.$$

It is easy to see that $\ge$ is a partial ordering, referred to as the Löwner
(partial) ordering, on the set of Hermitian matrices; that is,

1. $A \ge A$ for every Hermitian matrix $A$.
2. If $A \ge B$ and $B \ge A$, then $A = B$.
3. If $A \ge B$ and $B \ge C$, then $A \ge C$.

Obviously, $A + B \ge B$ if $A \ge 0$. That $A \ge 0 \Leftrightarrow X^*AX \ge 0$ in
(7.2) of the previous section immediately generalizes as follows.

$$A \ge B \;\Leftrightarrow\; X^*AX \ge X^*BX \tag{7.4}$$

for every complex matrix $X$ of appropriate size. If $A$ and $B$ are both
positive semidefinite, then $(A^{1/2})^* = A^{1/2}$ and thus $A^{1/2}BA^{1/2} \ge 0$.


Theorem 7.5 Let $A \ge 0$ and $B \ge 0$ be of the same size. Then

1. The trace of the product $AB$ is less than or equal to the product
of the traces $\operatorname{tr} A$ and $\operatorname{tr} B$; that is, $\operatorname{tr}(AB) \le \operatorname{tr} A \operatorname{tr} B$.

2. The eigenvalues of $AB$ are all nonnegative. Furthermore, $AB$
is positive semidefinite if and only if $AB = BA$.

3. If $\alpha$, $\beta$ are the largest eigenvalues of $A$, $B$, respectively, then

$$-\frac{1}{4}\alpha\beta I \le AB + BA \le 2\alpha\beta I.$$
Proof. To show (1), by unitary similarity, with $A = U^*DU$,

$$\operatorname{tr}(AB) = \operatorname{tr}(U^*DUB) = \operatorname{tr}(DUBU^*),$$

we may assume that $A = \operatorname{diag}(\lambda_1, \dots, \lambda_n)$. Suppose that $b_{11}, \dots, b_{nn}$
are the diagonal entries of $B$. Then

$$\begin{aligned}
\operatorname{tr}(AB) &= \lambda_1 b_{11} + \cdots + \lambda_n b_{nn} \\
&\le (\lambda_1 + \cdots + \lambda_n)(b_{11} + \cdots + b_{nn}) \\
&= \operatorname{tr} A \operatorname{tr} B.
\end{aligned}$$

For (2), recall that $XY$ and $YX$ have the same eigenvalues if $X$ and
$Y$ are square matrices of the same size. Thus, $AB = A^{1/2}(A^{1/2}B)$ has
the same eigenvalues as $A^{1/2}BA^{1/2}$, which is positive semidefinite.
$AB$ is not positive semidefinite in general, since it need not be
Hermitian. If $A$ and $B$ commute, however, then $AB$ is Hermitian, for

$$(AB)^* = B^*A^* = BA = AB,$$

and thus $AB \ge 0$. Conversely, if $AB \ge 0$, then it is Hermitian, and
$$AB = (AB)^* = B^*A^* = BA.$$

To show (3), we assume that $A \ne 0$ and $B \ne 0$. Dividing through
the inequalities by $\alpha\beta$, we see that the statement is equivalent to its
case $\alpha = 1$, $\beta = 1$. Thus, we need to show $-\frac{1}{4}I \le AB + BA \le 2I$.
Note that $0 \le A \le I$ implies $0 \le A^2 \le A \le I$. It follows that

$$\begin{aligned}
0 &\le \left(A + B - \tfrac{1}{2}I\right)^2 \\
&= (A + B)^2 - (A + B) + \tfrac{1}{4}I \\
&= A^2 + B^2 + AB + BA - A - B + \tfrac{1}{4}I \\
&\le AB + BA + \tfrac{1}{4}I,
\end{aligned}$$

that is, $AB + BA \ge -\frac{1}{4}I$. To show $AB + BA \le 2I$, we compute

$$0 \le (A - B)^2 = A^2 + B^2 - AB - BA \le 2I - AB - BA.$$
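
The three statements of Theorem 7.5 can be sanity-checked numerically. The sketch below uses only the Python standard library and two 2 x 2 matrices chosen for illustration (they are assumptions, not taken from the text); the helper names are likewise hypothetical.

```python
import math

def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace(X):
    return X[0][0] + X[1][1]

def det(X):
    return X[0][0] * X[1][1] - X[0][1] * X[1][0]

def real_eigs(X):
    """Eigenvalues (ascending) of a 2x2 matrix whose eigenvalues are real."""
    half_tr = trace(X) / 2
    root = math.sqrt(max(half_tr ** 2 - det(X), 0.0))
    return half_tr - root, half_tr + root

# Illustrative positive definite matrices (assumed for this example).
A = [[2.0, 1.0], [1.0, 1.0]]
B = [[1.0, 0.0], [0.0, 3.0]]

AB, BA = matmul(A, B), matmul(B, A)
S = [[AB[i][j] + BA[i][j] for j in range(2)] for i in range(2)]  # AB + BA

alpha, beta = real_eigs(A)[1], real_eigs(B)[1]  # largest eigenvalues
eig_AB = real_eigs(AB)
eig_S = real_eigs(S)

print("tr(AB) =", trace(AB), " tr A tr B =", trace(A) * trace(B))
print("eigenvalues of AB:", eig_AB)
print("AB symmetric?", AB[0][1] == AB[1][0])
print("bounds:", -alpha * beta / 4, eig_S, 2 * alpha * beta)
```

Here $A$ and $B$ do not commute, so $AB$ is not symmetric, yet its eigenvalues are still nonnegative, as part (2) predicts.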


What follows is the main result of this section, which we use to 
reduce many problems involving a pair of positive definite matrices 
to a problem involving two diagonal matrices. 
Theorem 7.6 Let $A$ and $B$ be $n$-square positive semidefinite matrices.
Then there exists an invertible matrix $P$ such that

$$P^*AP \quad \text{and} \quad P^*BP$$

are both diagonal matrices. In addition, if $A$ is nonsingular, then $P$
can be chosen so that $P^*AP = I$ and $P^*BP$ is diagonal.


Proof. Let $\operatorname{rank}(A + B) = r$ and $S$ be a nonsingular matrix so that

$$S^*(A + B)S = \begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix}.$$

Conformally partition $S^*BS$ as

$$S^*BS = \begin{pmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{pmatrix}.$$

By (7.4), we have $S^*(A + B)S \ge S^*BS$. This implies
$$B_{22} = 0, \quad B_{12} = 0, \quad B_{21} = 0.$$

Now for $B_{11}$, because $B_{11} \ge 0$, there exists an $r \times r$ unitary matrix
$T$ such that $T^*B_{11}T$ is diagonal. Put

$$P = S \begin{pmatrix} T & 0 \\ 0 & I_{n-r} \end{pmatrix}.$$

Then $P^*BP$ and $P^*AP = P^*(A + B)P - P^*BP$ are both diagonal.

If $A$ is invertible, we write $A = C^*C$ for some matrix $C$. Consider
the matrix $(C^{-1})^*BC^{-1}$. Since it is positive semidefinite, we have a
unitary matrix $U$ such that

$$(C^{-1})^*BC^{-1} = UDU^*,$$

where $D$ is a diagonal matrix with nonnegative diagonal entries.

Let $P = C^{-1}U$. Then $P^*AP = I$ and $P^*BP = D$. ∎
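
The second half of the proof is constructive, and a small sketch may make it concrete. The code below follows the recipe $A = C^*C$, $P = C^{-1}U$ for real symmetric 2 x 2 matrices, taking $C$ from a Cholesky factorization and $U$ as a plane rotation. Those two choices, the function names, and the sample matrices are assumptions made for illustration; the proof only requires some $C$ with $A = C^*C$ and some unitary $U$.

```python
import math

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(X):
    return [[X[j][i] for j in range(2)] for i in range(2)]

def congruence(P, M):
    """Return P^T M P."""
    return matmul(matmul(transpose(P), M), P)

def simultaneous_diagonalizer(A, B):
    """P with P^T A P = I and P^T B P diagonal (A > 0, B symmetric, real 2x2)."""
    # Cholesky factor: A = L L^T, i.e. A = C^T C with C = L^T.
    l00 = math.sqrt(A[0][0])
    l10 = A[1][0] / l00
    l11 = math.sqrt(A[1][1] - l10 ** 2)
    L_inv = [[1 / l00, 0.0], [-l10 / (l00 * l11), 1 / l11]]
    C_inv = transpose(L_inv)                  # C^{-1} = (L^T)^{-1}
    M = matmul(matmul(L_inv, B), C_inv)       # (C^{-1})^T B C^{-1}, symmetric
    # A plane rotation U diagonalizes the symmetric 2x2 matrix M.
    theta = 0.5 * math.atan2(2 * M[0][1], M[0][0] - M[1][1])
    c, s = math.cos(theta), math.sin(theta)
    U = [[c, -s], [s, c]]
    return matmul(C_inv, U)                   # P = C^{-1} U

# Illustrative matrices (assumed): A positive definite, B positive semidefinite.
A = [[4.0, 2.0], [2.0, 3.0]]
B = [[1.0, 1.0], [1.0, 2.0]]

P = simultaneous_diagonalizer(A, B)
PtAP = congruence(P, A)
PtBP = congruence(P, B)
print(PtAP)   # numerically the identity
print(PtBP)   # numerically diagonal
```

Note that $P$ is invertible but in general not orthogonal, which is consistent with Problem 4 of this section.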

Many results can be derived by reduction of positive semidefinite
matrices $A$ and $B$ to diagonal matrices, or further to nonnegative
numbers, to which some elementary inequalities may apply. The
following two are immediate from the previous theorem by writing
$A = P^*D_1P$ and $B = P^*D_2P$, where $P$ is an invertible matrix, and
$D_1$ and $D_2$ are diagonal matrices with nonnegative entries.
Theorem 7.7 Let $A \ge 0$, $B \ge 0$ be of the same order ($> 1$). Then
$$\det(A + B) \ge \det A + \det B \tag{7.5}$$

with equality if and only if $A + B$ is singular or $A = 0$ or $B = 0$, and
$$(A + B)^{-1} \le \frac{1}{4}\left(A^{-1} + B^{-1}\right) \tag{7.6}$$
if $A$ and $B$ are nonsingular, with equality if and only if $A = B$.


Theorem 7.8 If $A \ge B \ge 0$, then

1. $\operatorname{rank}(A) \ge \operatorname{rank}(B)$,
2. $\det A \ge \det B$, and
3. $B^{-1} \ge A^{-1}$ if $A$ and $B$ are nonsingular.


Every positive semidefinite matrix has a positive semidefinite
square root. The square root is a matrix monotone function for
positive semidefinite matrices in the sense that the Löwner partial
ordering is preserved when taking the square root.


Theorem 7.9 Let $A$ and $B$ be positive semidefinite matrices. Then
$$A \ge B \;\Rightarrow\; A^{1/2} \ge B^{1/2}.$$


Proof 1. It may be assumed that $A$ is positive definite by continuity
(Problem 5). Let $C = A^{1/2}$, $D = B^{1/2}$, and $E = C - D$. We have
to establish $E \ge 0$. For this purpose, it is sufficient to show that the
eigenvalues of $E$ are all nonnegative. Notice that

$$0 \le C^2 - D^2 = C^2 - (C - E)^2 = CE + EC - E^2.$$

It follows that $CE + EC \ge 0$, for $E$ is Hermitian and $E^2 \ge 0$.
On the other hand, let $\lambda$ be an eigenvalue of $E$ and let $u$ be an
eigenvector corresponding to $\lambda$. Then $\lambda$ is real and by (7.1),

$$0 \le u^*(CE + EC)u = 2\lambda(u^*Cu).$$

Since $C > 0$, we have $\lambda \ge 0$. Hence $E \ge 0$; namely, $C \ge D$.
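Proof 2. First notice that $A^{1/2} - B^{1/2}$ is Hermitian. We show that
the eigenvalues are all nonnegative. Let $(A^{1/2} - B^{1/2})x = \lambda x$, $x \ne 0$.
Then $B^{1/2}x = A^{1/2}x - \lambda x$. By the Cauchy–Schwarz inequality,

$$|x^*y| \le (x^*x)^{1/2}(y^*y)^{1/2}, \quad \text{for all } x, y \in \mathbb{C}^n.$$

Thus, we have
$$\begin{aligned}
x^*Ax &= (x^*Ax)^{1/2}(x^*Ax)^{1/2} \ge (x^*Ax)^{1/2}(x^*Bx)^{1/2} \ge x^*A^{1/2}B^{1/2}x \\
&= x^*A^{1/2}(A^{1/2}x - \lambda x) = x^*Ax - \lambda x^*A^{1/2}x.
\end{aligned}$$
It follows that $\lambda x^*A^{1/2}x \ge 0$. Since $x^*A^{1/2}x > 0$ when $A$ is positive
definite (which may be assumed by continuity, as in Proof 1), $\lambda \ge 0$.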


Proof 3. If $A$ is positive definite, then by using (7.4) and by multiplying
both sides of $B \le A$ by $A^{-1/2} = (A^{-1/2})^*$, we obtain

$$A^{-1/2}BA^{-1/2} \le I,$$

rewritten as

$$(B^{1/2}A^{-1/2})^*(B^{1/2}A^{-1/2}) \le I,$$

which gives
$$\sigma_{\max}(B^{1/2}A^{-1/2}) \le 1,$$

where $\sigma_{\max}$ means the largest singular value. Thus, by Problem 10,
$$\lambda_{\max}(A^{-1/4}B^{1/2}A^{-1/4}) = \lambda_{\max}(B^{1/2}A^{-1/2}) \le \sigma_{\max}(B^{1/2}A^{-1/2}) \le 1,$$
where $A^{-1/4}$ is the square root of $A^{-1/2}$, and, by Problem 11,
$$0 \le A^{-1/4}B^{1/2}A^{-1/4} \le I.$$
Multiplying both sides by $A^{1/4}$, the square root of $A^{1/2}$, we see that
$$B^{1/2} \le A^{1/2}.$$
The case for singular $A$ follows from a continuity argument.
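
A small numerical illustration of Theorem 7.9 may be helpful. The sketch below takes a pair $A \ge B \ge 0$ (chosen for illustration, not from the text), checks that the square roots keep the ordering, and shows that the squares do not, which is the phenomenon asked about in Problem 3. The closed form used for the 2 x 2 square root is an assumption of this sketch, not the method of the text.

```python
import math

def psd_sqrt(M):
    """Square root of a nonzero 2x2 real positive semidefinite matrix.

    Uses sqrt(M) = (M + s I) / t with s = sqrt(det M), t = sqrt(tr M + 2 s),
    which follows from the Cayley-Hamilton theorem applied to sqrt(M).
    """
    s = math.sqrt(max(M[0][0] * M[1][1] - M[0][1] * M[1][0], 0.0))
    t = math.sqrt(M[0][0] + M[1][1] + 2 * s)
    return [[(M[0][0] + s) / t, M[0][1] / t],
            [M[1][0] / t, (M[1][1] + s) / t]]

def min_eig(S):
    """Smallest eigenvalue of a real symmetric 2x2 matrix."""
    return (S[0][0] + S[1][1]) / 2 - math.hypot((S[0][0] - S[1][1]) / 2, S[0][1])

def sub(X, Y):
    return [[X[i][j] - Y[i][j] for j in range(2)] for i in range(2)]

def square(X):
    return [[sum(X[i][k] * X[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Illustrative pair with A - B positive semidefinite (assumed example).
A = [[2.0, 1.0], [1.0, 1.0]]
B = [[1.0, 0.0], [0.0, 0.0]]

gap = min_eig(sub(A, B))                            # nonnegative: A >= B
gap_sqrt = min_eig(sub(psd_sqrt(A), psd_sqrt(B)))   # nonnegative: roots stay ordered
gap_square = min_eig(sub(square(A), square(B)))     # negative: squares are not ordered
print(gap, gap_sqrt, gap_square)
```

The negative smallest eigenvalue of $A^2 - B^2$ shows that squaring does not preserve the Löwner ordering for this pair, while the square root does.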


Theorem 7.10 Let $A$ and $B$ be positive semidefinite matrices. Then
$$A \ge B \;\Rightarrow\; A^r \ge B^r, \quad 0 \le r \le 1.$$


This result, due to Löwner and Heinz, can be shown in a similar
way as the above proof 3. That is, one can prove that if the inequality
holds for $s, t \in [0, 1]$, then it holds for $(s + t)/2$, concluding that the
set of numbers in $[0, 1]$ for which the inequality holds is convex.
Problems

1. Show that $A \ge B \Rightarrow \operatorname{tr} A \ge \operatorname{tr} B$. When does equality occur?

2. Give an example where $A \ge 0$ and $B \ge 0$ but $AB$ is not Hermitian.

3. Show by example that $A \ge B \ge 0 \Rightarrow A^2 \ge B^2$ is not true in general.
But $AB = BA$ and $A \ge B \ge 0$ imply $A^k \ge B^k$, $k = 1, 2, \dots$.

4. Referring to Theorem 7.6, give an example showing that the matrix
$P$ is not unique in general. Show that matrix $P$ can be chosen to be
unitary if and only if $AB = BA$.

5. Show Theorem 7.9 for the singular case by a continuity argument;
that is, $\lim_{\varepsilon \to 0^+}(A + \varepsilon I)^{1/2} = A^{1/2}$ for $A \ge 0$.

6. Complete the proof of Theorem 7.10.

7. If $A$ is an $n \times n$ complex matrix such that $x^*Ax \ge x^*x$ for every
$x \in \mathbb{C}^n$, show that $A$ is nonsingular and that $A \ge I \ge A^{-1} > 0$.

8. Let $A, B \in M_n$. Show that $A > B$ (i.e., $A - B > 0$) if and only if
$X^*AX > X^*BX$ for every $n \times m$ matrix $X$ with $\operatorname{rank}(X) = m$.

9. Let $A$ be a nonsingular Hermitian matrix. Show that $A \ge A^{-1}$ if
and only if all eigenvalues of $A$ lie in $[-1, 0) \cup [1, \infty)$.

10. Let $A \in M_n$. If $A \ge 0$, show that $x^*Ax \le \lambda_{\max}(A)$ for all unit $x$ and
that $|\lambda(A)| \le \sigma_{\max}(A)$ for every eigenvalue $\lambda(A)$ of $A$.

11. Let $A = A^*$. Show that $0 \le A \le I \Leftrightarrow$ every eigenvalue $\lambda(A) \in [0, 1]$.

12. Let $A \ge 0$. Show that $\lambda_{\max}(A)I \ge A$ and $\lambda_{\max}(A) \ge \max\{a_{ii}\}$.

13. Let $A$ and $B$ be $n$-square positive semidefinite matrices.

(a) Show that $A > B \ge 0 \Leftrightarrow \lambda_{\max}(A^{-1}B) < 1$.

(b) Show that $A > B \ge 0 \Rightarrow \det A > \det B$.

(c) Give an example that $A \ge B \ge 0$, $\det A = \det B$, but $A \ne B$.

14. Let $A \ge 0$ and $B \ge 0$ be of the same size. As is known, the eigenvalues
of positive semidefinite matrices are the same as the singular values,
and the eigenvalues of $AB$ are nonnegative. Are the eigenvalues of
$AB$ in this case necessarily equal to the singular values of $AB$?

15. Show that the eigenvalues of the product of three positive semidefinite
matrices are not necessarily nonnegative by the example

(Eta i) e-(2 4) 
16. Let $A > 0$ and $B = B^*$ (not necessarily positive semidefinite) be of
the same size. Show that there exists a nonsingular matrix $P$ such
that $P^*AP = I$, $P^*BP$ is diagonal, and the diagonal entries of $P^*BP$
are the eigenvalues of $A^{-1}B$. Show by example that the assertion is
not true in general if $A > 0$ is replaced with $A \ge 0$.

17. Let $A$, $B$, $C$ be three $n$-square positive semidefinite matrices. Give an
example showing that there does not necessarily exist an invertible
matrix $P$ such that $P^*AP$, $P^*BP$, $P^*CP$ are all diagonal.

18. Let $A$, $B$ be $m \times n$ matrices. Show that for any $m \times m$ matrix $X > 0$
$$|\operatorname{tr}(A^*B)|^2 \le \operatorname{tr}(A^*XA)\operatorname{tr}(B^*X^{-1}B).$$

19. Let $A = (a_{ij})$ be an $n \times n$ positive semidefinite matrix. Show that
(a) Chi ak < tra? < (Th ai)’, 
(b) (Sika aie)? < tra? < Dh ail”, 
(6) (Shien) = Shag’ <tr A ASO. 

20. Let $A \ge 0$ and $B \ge 0$ be of the same size. Show that

$$\operatorname{tr}(A^{1/2}B^{1/2}) \le (\operatorname{tr} A)^{1/2}(\operatorname{tr} B)^{1/2}$$

and
$$(\operatorname{tr}(A + B))^{1/2} \le (\operatorname{tr} A)^{1/2} + (\operatorname{tr} B)^{1/2}.$$

21. Let $A > 0$ and $B > 0$ be of the same size. Show that
$$\operatorname{tr}\left((A^{-1} - B^{-1})(A - B)\right) \le 0.$$

22. Let $A > 0$ and $B \ge C \ge 0$ be all of the same size. Show that
$$\operatorname{tr}\left((A + B)^{-1}B\right) \ge \operatorname{tr}\left((A + C)^{-1}C\right).$$

23. Construct an example that $A > 0$, $B > 0$ but $AB + BA \not\ge 0$. Explain
why $A^{1/2}BA^{1/2} \le \frac{1}{2}(AB + BA)$ is not true in general for $A, B \ge 0$.

24. Show that for Hermitian matrices $A$ and $B$ of the same size,

$$A^2 + B^2 \ge AB + BA.$$

25. Let A = Pos) and B= ey Find a condition on $\alpha$ and $\beta$ so that

$$A^2 + AB + BA \not\ge 0.$$
26. Let $A$ and $B$ be $n$-square Hermitian matrices. Show that
$$A > 0, \; AB + BA \ge 0 \;\Rightarrow\; B \ge 0$$
and
$$A > 0, \; AB + BA > 0 \;\Rightarrow\; B > 0.$$

Show that $A > 0$ in fact can be replaced by the weaker condition
$A \ge 0$ with positive diagonal entries. Show by example that the
assertions do not hold in general if $A > 0$ is replaced by $A \ge 0$.

27. Let $A$ and $B$ be $n$-square real symmetric invertible matrices. Show
that there exists a real $n$-square invertible matrix $P$ such that $P^TAP$
and $P^TBP$ are both diagonal if and only if all the roots of $p(x) =
\det(xA - B)$ and $q(x) = \det(xB - A)$ are real.

28. Let A = a and B = ie Hae Does there exist a real matrix $P$
such that $P^TAP$ and $P^TBP$ are both diagonal? Does there exist a
complex matrix $P$ such that $P^*AP$ and $P^*BP$ are both diagonal?

29. Let $A > 0$ and $B > 0$ be of the same size. Show that
$$(A + B)^{-1} \le A^{-1} \quad (\text{or } B^{-1}).$$

30. Let $A > 0$ and $B > 0$ be of the same size. Show that
$$A^{-1} - (A + B)^{-1} - (A + B)^{-1}B(A + B)^{-1} \ge 0.$$

31. Let $A \ge 0$ and $B \ge 0$ be of the same size. Prove or disprove that
$$(BAB)^\alpha = B^\alpha A^\alpha B^\alpha,$$
where $\alpha = 2$, $\frac{1}{2}$, or $-1$ if $A$ and $B$ are nonsingular.

32. Let $A \ge 0$ and $B \ge 0$ be of the same size. Show that

BAB<I = BY?ABY <I. 

33. Let $A \ge 0$ and $B \ge 0$ be of the same size. Consider the inequalities:
(a) $A \ge B$   (b) $A^2 \ge B^2$   (c) $BA^2B \ge B^4$   (d) $(BA^2B)^{1/2} \ge B^2$.

Show that (a) $\not\Rightarrow$ (b). However, (b) $\Rightarrow$ (c) $\Rightarrow$ (d).

34. Prove Theorem 7.10. Show by example that $A \ge B \ge 0 \not\Rightarrow A^2 \ge B^2$.
35. Let $A \ge 0$ and $B \ge 0$ be of order $n$, where $n > 1$. Show that
$$(\det(A + B))^{1/n} \ge (\det A)^{1/n} + (\det B)^{1/n}.$$

Equality occurs if and only if $A = 0$ or $B = 0$ or $A + B$ is singular or
$B = aA$ for some $a > 0$. Conclude that

$$\det(A + B) \ge \det A + \det B$$
with equality if and only if $A = 0$ or $B = 0$ or $A + B$ is singular, and
$$\det(A + B) \ge \det A$$

with equality if and only if $B = 0$ or $A + B$ is singular.
36. Let $A$, $B$, and $C$ be $n \times n$ positive semidefinite matrices. Show that

$$\det(A + B) + \det(A + C) \le \det A + \det(A + B + C).$$

[Hint: Use elementary symmetric functions and compound matrices.]


37. Let $A \ge 0$ and $B \ge 0$ be of the same size. Show that for any $\lambda, \mu \in \mathbb{C}$,
$$|\det(\lambda A + \mu B)| \le \det(|\lambda| A + |\mu| B).$$

38. Show by example that $A \ge B \ge 0$ does not imply
$$A^{1/2} - B^{1/2} \le (A - B)^{1/2}.$$


39. Let A >0 and B > 0 be of the same size. Show that 


trB = det B 
trA ~ det A’ 


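40. Let $A \ge 0$ and $B \ge 0$, both $n \times n$. Show that
$$\operatorname{tr}(ABA) \le \operatorname{tr}(A)\,\lambda_{\max}(AB)$$

and
$$\operatorname{tr}(A + ABA) \le (n + \operatorname{tr}(AB))\,\lambda_{\max}(A).$$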


41. Show that A >0 and B > C > 0 imply the matrix inequality 
ARAYA S AACA hat not BYCAR? > Or AC. 
42. Let $A$ and $B$ be $n$-square complex matrices. Prove or disprove

$$A^*A \le B^*B \;\Rightarrow\; A^*CA \le B^*CB, \quad \text{for } C \ge 0.$$

43. Let $A$ and $B$ be $n$-square complex matrices. Prove or disprove

$$A^*A \le B^*B \;\Rightarrow\; AA^* \le BB^*.$$


44. Let $A \ge 0$ and $B \ge 0$ be of the same size. For any $t \in [0, 1]$ with
$\tilde{t} = 1 - t$, assuming that the involved inverses exist, show that

ri = -1 = 
aD 24 es es 
[b) GALE)? tA PB ge(4 Pe sa 

(c) $(tA + \tilde{t}B)^{1/2} \ge tA^{1/2} + \tilde{t}B^{1/2}$. So $\left(\frac{A + B}{2}\right)^{1/2} \ge \frac{A^{1/2} + B^{1/2}}{2}$.

(d) $(tA + \tilde{t}B)^2 \le tA^2 + \tilde{t}B^2$. So $\left(\frac{A + B}{2}\right)^2 \le \frac{A^2 + B^2}{2}$.

Show, however, that $\left(\frac{A + B}{2}\right)^3 \le \frac{A^3 + B^3}{2}$ is not true in general.

(e) $\det(tA + \tilde{t}B) \ge (\det A)^t(\det B)^{\tilde{t}}$. So $\det\left(\frac{A + B}{2}\right) \ge \sqrt{\det A \det B}$.


45. Let $A > 0$ and $B > 0$ be matrices of the same size with eigenvalues
contained in the closed interval $[m, M]$. Show that

oeuy4+B) < vie mar ee 
~ Ae ( AW BAe At 
1 
< (A+ 8B). 


46. Let $A$, $B$, and $C$ be Hermitian matrices of the same size. If $A > B$
and if $C$ commutes with both $AB$ and $A + B$, show that $C$ commutes
with $A$ and $B$. What if the condition $A > B$ is removed?


47. Show that the product $AB$ of a positive definite matrix $A$ and a
Hermitian matrix $B$ is diagonalizable. What if $A$ is singular?


48. Let $A$ be an $n$-square complex matrix. If $A + A^* > 0$, show that
every eigenvalue of $A$ has a positive real part. Use this fact to show
that $X > 0$ if $X$ is a Hermitian matrix satisfying for some $Y > 0$

$$XY + YX > 0.$$


49. Let $A$ and $B$ be $n \times n$ Hermitian matrices. If $A^k + B^k = 2I_n$ for some
positive integer $k$, show that $A + B \le 2I_n$. [Hint: Consider odd $k$
first. For the even case, show that $A^{2p} + B^{2p} \le 2I \Rightarrow A^p + B^p \le 2I$.]
7.3 Partitioned Positive Semidefinite Matrices


In this section we present the Fischer and Hadamard determinant
inequalities and the matrix inequalities involving principal submatrices
of positive semidefinite matrices.

Let $A$ be a square complex matrix partitioned as

$$A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}, \tag{7.7}$$

where $A_{11}$ is a square submatrix of $A$. If $A_{11}$ is nonsingular, we have

$$\begin{pmatrix} I & 0 \\ -A_{21}A_{11}^{-1} & I \end{pmatrix} A \begin{pmatrix} I & -A_{11}^{-1}A_{12} \\ 0 & I \end{pmatrix} = \begin{pmatrix} A_{11} & 0 \\ 0 & \widetilde{A_{11}} \end{pmatrix}, \tag{7.8}$$

where

$$\widetilde{A_{11}} = A_{22} - A_{21}A_{11}^{-1}A_{12}$$

is called the Schur complement of $A_{11}$ in $A$. By taking determinants,
$$\det A = \det A_{11} \det \widetilde{A_{11}}.$$
If $A$ is a positive definite matrix, then $A_{11}$ is nonsingular and
$$A_{22} \ge \widetilde{A_{11}} > 0.$$
The Fischer determinant inequality follows, for $\det A_{22} \ge \det \widetilde{A_{11}}$.


Theorem 7.11 (Fischer Inequality) If $A$ is a positive semidefinite
matrix partitioned as in (7.7), then

$$\det A \le \det A_{11} \det A_{22}$$
with equality if and only if both sides vanish or $A_{12} = 0$. Also
$$|\det A_{12}|^2 \le \det A_{11} \det A_{22}$$

if the blocks $A_{11}$, $A_{12}$, $A_{21}$, and $A_{22}$ are square matrices of the same size.
Proof. The nonsingular case follows from the earlier discussion. For
the singular case, one may replace $A$ with $A + \varepsilon I$, $\varepsilon > 0$, to obtain
the desired inequality by a continuity argument.

If equality holds and both $A_{11}$ and $A_{22}$ are nonsingular, then

$$\det \widetilde{A_{11}} = \det A_{22} = \det(A_{22} - A_{21}A_{11}^{-1}A_{12}).$$

Thus, $A_{21}A_{11}^{-1}A_{12} = 0$ or $A_{12} = 0$ by Theorem 7.7 and Problem 9.
For the second inequality, notice that $A_{22} \ge A_{21}A_{11}^{-1}A_{12} \ge 0$.
The assertion follows at once by taking the determinants. ∎


An induction on the size of the matrices gives the following result. 


Theorem 7.12 (Hadamard Inequality) Let $A$ be a positive semidefinite
matrix with diagonal entries $a_{11}, a_{22}, \dots, a_{nn}$. Then

$$\det A \le a_{11}a_{22} \cdots a_{nn}.$$
Equality holds if and only if some $a_{ii} = 0$ or $A$ is diagonal.


A direct proof goes as follows. Assume that each $a_{ii} > 0$ and
let $D = \operatorname{diag}(a_{11}^{-1/2}, \dots, a_{nn}^{-1/2})$. Put $B = DAD$. Then $B$ is a positive
semidefinite matrix with diagonal entries all equal to 1. By the
arithmetic mean–geometric mean inequality, we have
$$n = \operatorname{tr} B = \sum_{i=1}^{n} \lambda_i(B) \ge n\left(\prod_{i=1}^{n} \lambda_i(B)\right)^{1/n} = n(\det B)^{1/n}.$$
This implies $\det B \le 1$. Thus,
$$\det A = \det(D^{-1}BD^{-1}) = \prod_{i=1}^{n} a_{ii} \det B \le \prod_{i=1}^{n} a_{ii}.$$

Equality occurs if and only if the eigenvalues of $B$ are identical and
$\det B = 1$; that is, $B$ is the identity and $A$ is diagonal.
It follows that for any complex matrix $A$ of size $m \times n$,

$$\det(A^*A) \le \prod_{j=1}^{n} \sum_{i=1}^{m} |a_{ij}|^2. \tag{7.9}$$
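
The Fischer and Hadamard inequalities are easy to check on a concrete matrix. The following sketch uses a 3 x 3 positive definite matrix chosen for illustration (an assumption, not from the text), partitioned into a 2 x 2 leading block and a 1 x 1 trailing block.

```python
def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

# Illustrative positive definite matrix: its leading minors 4, 11, 43 are positive.
A = [[4, 1, 2],
     [1, 3, 0],
     [2, 0, 5]]

det_A = det3(A)
det_A11 = A[0][0] * A[1][1] - A[0][1] * A[1][0]   # upper-left 2x2 block
det_A22 = A[2][2]                                 # lower-right 1x1 block
fischer_bound = det_A11 * det_A22                 # det A11 * det A22
hadamard_bound = A[0][0] * A[1][1] * A[2][2]      # product of diagonal entries

print(det_A, "<=", fischer_bound, "<=", hadamard_bound)
```

For this matrix the Fischer bound sits between the determinant and the Hadamard bound, which matches the remark that the Hadamard inequality follows from the Fischer inequality by induction.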
An interesting application of the Hadamard inequality is to show
that if $A = B + iC \ge 0$, where $B$ and $C$ are real matrices, then

$$\det A \le \det B$$

by passing the diagonal entries of $A$ to $B$ through a real orthogonal
diagonalization of the real matrix $B$ (Problem 31).

We now turn our attention to the inequalities involving principal
submatrices of positive semidefinite matrices.

Let $A$ be an $n$-square positive semidefinite matrix. We denote in
this section by $[A]_\omega$, or simply $[A]$, the $k \times k$ principal submatrix of $A$
indexed by a sequence $\omega = \{i_1, \dots, i_k\}$, where $1 \le i_1 < \cdots < i_k \le n$.
We are interested in comparing $f([A])$ and $[f(A)]$, where $f(x)$ is the
elementary function $x^2$, $x^{1/2}$, $x^{-1/2}$, or $x^{-1}$.


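Theorem 7.13 Let $A \ge 0$ and let $[A]$ be a principal submatrix of
the matrix $A$. Then, assuming that the inverses involved exist,

$$[A^2] \ge [A]^2, \quad [A^{1/2}] \le [A]^{1/2}, \quad [A^{-1/2}] \ge [A]^{-1/2}, \quad [A^{-1}] \ge [A]^{-1}.$$


Proof. We may assume that $[A] = A_{11}$ as in (7.7). Otherwise one can
carry out a similar argument for $P^TAP$, where $P$ is a permutation
matrix so that $[A]$ is in the upper-left corner.

Partition $A^2$, $A^{1/2}$, and $A^{-1}$ conformally to $A$ in (7.7) as

$$A^2 = \begin{pmatrix} E & F \\ F^* & G \end{pmatrix}, \quad A^{1/2} = \begin{pmatrix} P & Q \\ Q^* & R \end{pmatrix}, \quad A^{-1} = \begin{pmatrix} X & Y \\ Y^* & Z \end{pmatrix}.$$

Upon computation of $A^2$ using (7.7), we have the first inequality:
$$[A^2] = E = A_{11}^2 + A_{12}A_{21} \ge A_{11}^2 = [A]^2.$$

This yields $[A]^{1/2} \ge [A^{1/2}]$ by replacing $A$ with $A^{1/2}$ and then taking
the square root. The third inequality, $[A^{-1/2}] \ge [A]^{-1/2}$, follows from
an application of the last inequality to the second.

We are left to show $[A^{-1}] \ge [A]^{-1}$. By Theorem 2.4, we have

$$[A^{-1}] = X = A_{11}^{-1} + A_{11}^{-1}A_{12}\widetilde{A_{11}}^{-1}A_{21}A_{11}^{-1} \ge A_{11}^{-1} = [A]^{-1}. \quad ∎$$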
The inequalities in Theorem 7.13 may be unified in the form
$$[f(A)] \ge f([A]), \quad \text{where } f(x) = x^2, \; -x^{1/2}, \; x^{-1/2}, \text{ or } x^{-1}.$$

Notice that if $A$ is an $n$-square positive definite matrix, then for
any $n \times m$ matrix $B$,

$$\begin{pmatrix} A & B \\ B^* & B^*A^{-1}B \end{pmatrix} \ge 0. \tag{7.10}$$

Note also that $B^*A^{-1}B$ is the smallest matrix to make the block
matrix positive semidefinite in the Löwner partial ordering sense.


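Theorem 7.14 Let $A \in M_n$ be a positive definite matrix and let $B$
be an $n \times m$ matrix. Then for any positive semidefinite $X \in M_m$,

$$\begin{pmatrix} A & B \\ B^* & X \end{pmatrix} \ge 0 \;\Leftrightarrow\; X \ge B^*A^{-1}B.$$


Proof. It is sufficient to notice the matrix identity

$$\begin{pmatrix} I_n & 0 \\ -B^*A^{-1} & I_m \end{pmatrix} \begin{pmatrix} A & B \\ B^* & X \end{pmatrix} \begin{pmatrix} I_n & -A^{-1}B \\ 0 & I_m \end{pmatrix} = \begin{pmatrix} A & 0 \\ 0 & X - B^*A^{-1}B \end{pmatrix}. \quad ∎$$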
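
Theorem 7.14 can be illustrated with a 2 x 2 block $A$, a 2 x 1 block $B$, and a scalar $X = x$. In the sketch below the matrices are assumptions chosen for illustration. Because the leading 2 x 2 block is positive definite, Sylvester's criterion says the 3 x 3 block matrix is positive definite exactly when its determinant is positive, and a negative determinant rules out positive semidefiniteness; the determinant changes sign at $x = B^*A^{-1}B$.

```python
def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

A = [[2.0, 1.0], [1.0, 1.0]]   # positive definite block (assumed example)
b = [1.0, 2.0]                 # the 2x1 block B, stored as a list

det_A = A[0][0] * A[1][1] - A[0][1] * A[1][0]
A_inv = [[A[1][1] / det_A, -A[0][1] / det_A],
         [-A[1][0] / det_A, A[0][0] / det_A]]

# threshold = B* A^{-1} B, a scalar here because B has one column
threshold = sum(b[i] * A_inv[i][j] * b[j] for i in range(2) for j in range(2))

def block_det(x):
    """Determinant of the block matrix [[A, B], [B*, x]]."""
    M = [[A[0][0], A[0][1], b[0]],
         [A[1][0], A[1][1], b[1]],
         [b[0],    b[1],    x]]
    return det3(M)

print("threshold:", threshold)
print(block_det(threshold + 1), block_det(threshold), block_det(threshold - 1))
```

The determinant equals $\det A \,(x - B^*A^{-1}B)$, in line with the matrix identity used in the proof.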


Note that, by Theorem 2.4, $[A^{-1}] = (A_{11} - A_{12}A_{22}^{-1}A_{21})^{-1}$. Thus
for any positive semidefinite matrix $A$ partitioned as in (7.7),

$$A - \begin{pmatrix} [A^{-1}]^{-1} & 0 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} A_{12}A_{22}^{-1}A_{21} & A_{12} \\ A_{21} & A_{22} \end{pmatrix} \ge 0. \tag{7.11}$$

Since a principal submatrix of a positive definite matrix is also
positive definite, we have, for any $n \times n$ matrices $A$, $B$, and $C$,

$$\begin{pmatrix} A & B \\ B^* & C \end{pmatrix} \ge 0 \;\Rightarrow\; \begin{pmatrix} [A] & [B] \\ [B^*] & [C] \end{pmatrix} \ge 0.$$

Using the partitioned matrix in (7.10), we obtain the following result.

Theorem 7.15 Let $A$ be an $n \times n$ positive definite matrix. Then for
any $n \times n$ matrix $B$, with $[\,\cdot\,]$ standing for a principal submatrix,

$$[B^*][A]^{-1}[B] \le [B^*A^{-1}B].$$


We end the section by presenting a result on partitioned posi- 
tive semidefinite matrices. It states that for a partitioned positive 
semidefinite matrix in which each block is square, the resulting ma- 
trix of taking determinant of each block is also positive semidefinite. 
An analogous result for trace is given in Section 7.5. 


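Theorem 7.16 Let $A$ be a $kn \times kn$ positive semidefinite matrix
partitioned as $A = (A_{ij})$, where each $A_{ij}$ is an $n \times n$ matrix, $1 \le i, j \le k$.
Then the $k \times k$ matrix $D = (\det A_{ij})$ is positive semidefinite.


Proof. By Theorem 7.3, we write $A = R^*R$, where $R$ is $kn \times kn$.
Partition $R = (R_1, R_2, \dots, R_k)$, where $R_i$ is $kn \times n$, $i = 1, 2, \dots, k$. Then
$A_{ij} = R_i^*R_j$. Applying the Binet–Cauchy formula (Theorem 4.9) to
$\det A_{ij} = \det(R_i^*R_j)$ with $\alpha = \{1, 2, \dots, n\}$, we have

$$\det A_{ij} = \det(R_i^*R_j) = \sum_{\beta} \det R_i^*[\alpha|\beta] \det R_j[\beta|\alpha],$$

where $\beta = \{j_1, \dots, j_n\}$, $1 \le j_1 < \cdots < j_n \le kn$. It follows that

$$\begin{aligned}
D = (\det A_{ij}) &= \Big(\sum_{\beta} \det R_i^*[\alpha|\beta] \det R_j[\beta|\alpha]\Big) \\
&= \sum_{\beta} \big(\det R_i^*[\alpha|\beta] \det R_j[\beta|\alpha]\big) \\
&= \sum_{\beta} \big((\det R_i[\beta|\alpha])^* \det R_j[\beta|\alpha]\big) \\
&= \sum_{\beta} T_\beta^* T_\beta \ge 0,
\end{aligned}$$

where $T_\beta$ is the row vector $(\det R_1[\beta|\alpha], \dots, \det R_k[\beta|\alpha])$. ∎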
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Problems 


1. Show that (aa) > 0 for any X > A> 0. 


2. Show that if A> B > 0, then ‘ee =. 


3. Show that X must be the zero matrix if e me) > 0. 


4. Show that : cn) > 0 for any matrix X. 


5. Show that (pide P) 20 and (sie gle) 2 Oif A, B > 0. 


6. Let A>0 and B > 0 be of the same size. Show that 


Grn A 


—1 —l)-1 
A ee X>-(A“+B")™. 


7. Refer to Theorem 7.11. Show the reversal Fischer inequality

$$\det \widetilde{A_{11}} \det \widetilde{A_{22}} \le \det A.$$


8. Show the Hadamard determinant inequality by Theorem 7.3(4). 


9. Let $A$ be an $n \times m$ complex matrix and $B$ be an $n \times n$ positive definite
matrix. If $A^*BA = 0$, show that $A = 0$. Does the assertion hold if
$B$ is a nonzero positive semidefinite or general nonsingular matrix?


10. When does equality in (7.9) occur? 


11. Show that a square complex matrix $A$ is unitary if and only if each
row (column) vector of $A$ has length 1 and $|\det A| = 1$.


12. Show that the following matrices are positive semidefinite. 


(a) Ca i for any m x n matrix A. 


ay for any positive definite matrix A. 


a oe) for any A and B of the same size. 
: :) for any A and B of the same size. 
a 4) for any \ € (0,-+00) and A > 0. 


‘ a for any matrix A. 
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13. Let A ben xn and B ben x m. If A is nonsingular, verify that 


(# wate )=(0 #)(4 a) (0 5) 


14. Let A > 0 and B > 0 be of the same size. Show that, with [X] 
standing for the corresponding principal submatrices of X, 


(a) [((A+ B)] < [A] +[B“4], 
(b) [A+ B]“* < [A}-* + [B™], 
(c) [A ue 4 (Ae), 
(d) [A]-* + [B]-? < [477] + [B~4). 


15. Show that for any square complex matrix A, with [X] standing for 
the corresponding principal submatrices of X, 


[A* A] > [A*] [A]. 
16. Let $A$ and $B$ be square complex matrices of the same size. With $[X]$
standing for the corresponding principal submatrices of $X$, show that
$$[AB][B^*A^*] \le [ABB^*A^*]$$

and

$$[A][B][B^*][A^*] \le [A][BB^*][A^*].$$
Show by example that the inequalities below do not hold in general:
$$[A][B][B^*][A^*] \le [ABB^*A^*],$$
$$[AB][B^*A^*] \le [A][BB^*][A^*],$$
$$[A][BB^*][A^*] \le [ABB^*A^*].$$
Conclude that the following inequality does not hold in general.
$$[B^*][A][B] \le [B^*AB], \quad \text{where } A \ge 0.$$


17. Let $A$ be a positive semidefinite matrix. Show by the given $A$ that

$$[A^4] \ge [A]^4$$
is not true in general, where $[\,\cdot\,]$ stands for a principal submatrix and
$$A = \begin{pmatrix} 1 & 0 & 1 \\ 0 & 0 & 1 \\ 1 & 1 & 1 \end{pmatrix}.$$

What is wrong with the proof, using $[A^2] \ge [A]^2$ in Theorem 7.13,
$$[A^4] = [(A^2)^2] \ge [A^2]^2 \ge ([A]^2)^2 = [A]^4?$$
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18. Let A>0 and B > 0 be of the same size. Show that 
A=27RBIXS & B= eA Ke SH, 


19. Let A>0 and B> 0 be of the same size. Show that 


trB tr B 


A'S) 2 5 (A) aA 


and 
trA 


So ee 
—~ 14+ Amax(AB) 
20. Let AE M,, Ce M,,, and B be n x m. Show that 


tr ((I + AB)~* A) 


A B be 
€: ae. => tr(B*B) <trAtrCd. 


Does 
det(B* B) < det Adet C? 


21. Let A, B, and C be n-square matrices. If AB = BA, show that 


A B 1/207 41/2 * 
& a) 20 => A’CA’ > BB. 


22. Let A, B, and C be n-square matrices. Show that 


A B * 
& G20 => A+t(B+B*)4+C>0. 


23. Let A, B, and C be n-square matrices. Show that 


& pe > s05 a ee 


where }(X) denotes the sum of all entries of matrix X. 


24. Give an example of square matrices A, B, C of the same size for which 


A B A B* 
(AB) 20m (4B) x0 


25. Give an example of square matrices A, B, C of the same size for which 


2 2 
& G20 but a. Go) 20 
26. Let $A \in M_n$ be a positive definite matrix. Show that for any $B \in M_n$,
$$A + B + B^* + B^*A^{-1}B \ge 0.$$

In particular,
$$I + B + B^* + B^*B \ge 0.$$


27. For any nonzero vectors $x_1, x_2, \dots, x_n$ in an inner product space, let
$$G(x_1, x_2, \dots, x_n) = ((x_j, x_i)).$$
Such a matrix is called the Gram matrix of $x_1, x_2, \dots, x_n$. Show that
$$G(x_1, x_2, \dots, x_n) \ge 0$$
and that
$$\det G(x_1, \dots, x_n) \le \prod_{i=1}^{n} (x_i, x_i).$$
Equality holds if and only if the vectors are orthogonal. Moreover,

$$\det G(x_1, \dots, x_n) \le \det G(x_1, \dots, x_m) \det G(x_{m+1}, \dots, x_n).$$


28. Let u1,U2,...,Un € C™ ben column vectors of m components. Form 
four matrices as follows by these vectors. Show that the first three 
matrices are positive semidefinite, and the last one is not in general. 


UjU1 UpUZ ... UpUn UU Usu, ... URDU 
UzxU1 UZUQ «..  UQUn UjUg UsUQ ... URUe 
2 9 
* * * * * * 
UrUL USU2Q ...  UZUn UjUn UsUn  .-.  UUn 
UjUp U1UZ «UU, UU, UgQUT we. Unt 
UgUy UgUs ... UgUy, UjU, UQUZ ... UnU5 
o] 
* * * * * * 
UnUp Uns... Une, UU, U2Us, UnUy, 


29. Use the Hadamard inequality to show the Fischer inequality. [Hint:
If $B$ is a principal submatrix of $A$, where $A$ is Hermitian, then there
exists a unitary matrix $U$ such that $U^*BU$ is diagonal.]


30. Show Theorem 7.16 by a compound matrix. [Hint: The k x k matrix 
(det A;;) is a principal submatrix of the nth compound matrix of A.] 
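31. Let $A = B + iC \ge 0$, where $B$ and $C$ are real matrices. Show that

(a) $B$ is positive semidefinite.

(b) $C$ is skew-symmetric.

(c) $\det B \ge \det A$; when does equality occur?

(d) $\operatorname{rank}(B) \ge \max\{\operatorname{rank}(A), \operatorname{rank}(C)\}$.


32. Find a positive semidefinite matrix $A$ partitioned as
$$A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}$$
such that
$$\det A = \det A_{11} \det A_{22} \quad \text{but} \quad A_{12} = A_{21}^* \ne 0.$$


33. Show that for any complex matrices $A$ and $B$ of the same size,

$$\begin{pmatrix} \det(A^*A) & \det(A^*B) \\ \det(B^*A) & \det(B^*B) \end{pmatrix} \ge 0, \quad \begin{pmatrix} \operatorname{tr}(A^*A) & \operatorname{tr}(A^*B) \\ \operatorname{tr}(B^*A) & \operatorname{tr}(B^*B) \end{pmatrix} \ge 0,$$
and
$$\begin{pmatrix} A^*A & A^*B \\ B^*A & B^*B \end{pmatrix} \ge 0.$$


34. Let $A$ be an $n \times n$ positive semidefinite matrix partitioned as

$$A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}, \quad \text{where } A_{11} \text{ and } A_{22} \text{ are square.}$$

Show that, by writing $A = X^*X$, where $X = (S, T)$ for some $S$, $T$,
$$C(A_{12}) \subseteq C(A_{11}), \quad C(A_{21}) \subseteq C(A_{22}),$$
and
$$R(A_{12}) \subseteq R(A_{22}), \quad R(A_{21}) \subseteq R(A_{11}).$$

Further show that

$$\operatorname{rank}(A_{11}, A_{12}) = \operatorname{rank}(A_{11}), \quad \operatorname{rank}(A_{21}, A_{22}) = \operatorname{rank}(A_{22}).$$
Derive that $A_{12} = A_{11}P$ and $A_{21} = QA_{11}$ for some $P$ and $Q$. Thus

$$\max\{\operatorname{rank}(A_{12}), \operatorname{rank}(A_{21})\} \le \min\{\operatorname{rank}(A_{11}), \operatorname{rank}(A_{22})\}.$$

35. Let $[X]$ stand for a principal submatrix of $X$. If $A \ge 0$, show that

$$\operatorname{rank}[A^k] = \operatorname{rank}[A]^k = \operatorname{rank}[A], \quad k = 1, 2, \dots.$$

[Hint: $\operatorname{rank}[AB] \le \operatorname{rank}[A]$ and $\operatorname{rank}[A^2] = \operatorname{rank}[A]$ for $A, B \ge 0$.]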
7.4 Schur Complements and Determinant Inequalities


Making use of Schur complements (or type III elementary operations
for partitioned matrices) has proved to be an important technique
in many matrix problems and applications in statistics. In this
section, we are concerned with matrix and determinant (in)equalities
involving matrices in the forms $I + A^*A$, $I - A^*A$, and $I - A^*B$.
As defined in the previous section, the Schur complement of the
nonsingular principal submatrix $A_{11}$ in the partitioned matrix

$$A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}$$

is

$$\widetilde{A_{11}} = A_{22} - A_{21}A_{11}^{-1}A_{12}.$$

Note that if $A$ is positive semidefinite and if $A_{11}$ is nonsingular, then
$$A_{22} \ge \widetilde{A_{11}} \ge 0.$$


Theorem 7.17 Let $A > 0$ be partitioned as above. Then

$$A^{-1} = \begin{pmatrix} \widetilde{A_{22}}^{-1} & X \\ Y & \widetilde{A_{11}}^{-1} \end{pmatrix}, \tag{7.12}$$

where $\widetilde{A_{22}} = A_{11} - A_{12}A_{22}^{-1}A_{21}$ is the Schur complement of $A_{22}$ in $A$,
$$X = -A_{11}^{-1}A_{12}\widetilde{A_{11}}^{-1} = -\widetilde{A_{22}}^{-1}A_{12}A_{22}^{-1}$$

and

$$Y = -A_{22}^{-1}A_{21}\widetilde{A_{22}}^{-1} = -\widetilde{A_{11}}^{-1}A_{21}A_{11}^{-1}.$$


The proof of this theorem follows from Theorem 2.4 immediately.
The inverse form (7.12) of $A$ in terms of Schur complements is
very useful. We demonstrate an application of it to obtain some
determinant inequalities and present the Hua inequality at the end.
Theorem 7.18 For any $n$-square complex matrices $A$ and $B$,
$$\det(I + AA^*) \det(I + B^*B) \ge |\det(A + B)|^2 + |\det(I - AB^*)|^2$$
with equality if and only if $n = 1$ or $A + B = 0$ or $AB^* = I$.


The determinant inequality proceeds from the following key matrix
identity, for which we present two proofs.

$$I + AA^* = (A + B)(I + B^*B)^{-1}(A + B)^* + (I - AB^*)(I + BB^*)^{-1}(I - AB^*)^*. \tag{7.13}$$

Note that the left-hand side of (7.13) is independent of $B$.


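Proof 1 for the identity (7.13). Use Schur complements. Let

$$X = \begin{pmatrix} I + B^*B & B^* + A^* \\ A + B & I + AA^* \end{pmatrix}.$$

Then the Schur complement of $I + B^*B$ in $X$ is
$$(I + AA^*) - (A + B)(I + B^*B)^{-1}(A + B)^*. \tag{7.14}$$
On the other hand, we write
$$X = \begin{pmatrix} I & B^* \\ A & I \end{pmatrix} \begin{pmatrix} I & A^* \\ B & I \end{pmatrix}.$$
Then by using (7.12), if $I - AB^*$ is invertible (then so is $I - B^*A$),
$$\begin{aligned}
X^{-1} &= \begin{pmatrix} I & A^* \\ B & I \end{pmatrix}^{-1} \begin{pmatrix} I & B^* \\ A & I \end{pmatrix}^{-1} \\
&= \begin{pmatrix} (I - A^*B)^{-1} & -(I - A^*B)^{-1}A^* \\ -(I - BA^*)^{-1}B & (I - BA^*)^{-1} \end{pmatrix}
\begin{pmatrix} (I - B^*A)^{-1} & -B^*(I - AB^*)^{-1} \\ -(I - AB^*)^{-1}A & (I - AB^*)^{-1} \end{pmatrix}.
\end{aligned}$$

Thus, we have the lower-right corner of $X^{-1}$, after multiplying out
the right-hand side and then taking inverses,

$$(I - AB^*)(I + BB^*)^{-1}(I - AB^*)^*. \tag{7.15}$$

Equating (7.14) and (7.15), by (7.12), results in (7.13). The singular
case of $I - AB^*$ follows from a continuity argument by replacing $A$
with $\varepsilon A$ such that $I - \varepsilon AB^*$ is invertible and by letting $\varepsilon \to 1$.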


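Proof 2 for the identity (7.13). A direct proof by showing that
$$(I + AA^*) - (I - AB^*)(I + BB^*)^{-1}(I - AB^*)^*$$
equals
$$(A + B)(I + B^*B)^{-1}(A + B)^*.$$

Noticing that
$$B(I + B^*B) = (I + BB^*)B,$$

we have, by multiplying the inverses,
$$(I + BB^*)^{-1}B = B(I + B^*B)^{-1} \tag{7.16}$$
and, by taking the conjugate transpose,
$$B^*(I + BB^*)^{-1} = (I + B^*B)^{-1}B^*. \tag{7.17}$$
Furthermore, the identity
$$I = (I + B^*B)(I + B^*B)^{-1}$$
yields
$$I - B^*B(I + B^*B)^{-1} = (I + B^*B)^{-1} \tag{7.18}$$
and, by switching $B$ and $B^*$,
$$I - BB^*(I + BB^*)^{-1} = (I + BB^*)^{-1}. \tag{7.19}$$
Upon computation, we have
$$\begin{aligned}
&(I + AA^*) - (I - AB^*)(I + BB^*)^{-1}(I - AB^*)^* \\
&\quad = AA^* - AB^*(I + BB^*)^{-1}BA^* + AB^*(I + BB^*)^{-1} \\
&\qquad + (I + BB^*)^{-1}BA^* + I - (I + BB^*)^{-1} && \text{(by expansion)} \\
&\quad = AA^* - A(I + B^*B)^{-1}B^*BA^* + A(I + B^*B)^{-1}B^* \\
&\qquad + B(I + B^*B)^{-1}A^* + BB^*(I + BB^*)^{-1} && \text{(by 7.16, 7.17, 7.19)} \\
&\quad = A(I + B^*B)^{-1}A^* + A(I + B^*B)^{-1}B^* \\
&\qquad + B(I + B^*B)^{-1}A^* + B(I + B^*B)^{-1}B^* && \text{(by 7.17, 7.18)} \\
&\quad = (A + B)(I + B^*B)^{-1}(A + B)^* && \text{(by factoring).}
\end{aligned}$$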
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The identity (7.13) thus follows. 

Now we are ready to prove the determinant inequality. Recall
that (Theorem 7.7 and Problem 35 of Section 7.2) for any positive
semidefinite matrices $X$ and $Y$ of the same size (more than 1),

$$\det(X+Y) \ge \det X + \det Y$$

with equality if and only if $X + Y$ is singular or $X = 0$ or $Y = 0$.
Applying this to (7.13) and noticing that $I + AA^*$ is never singular,
we have, when $A$ and $B$ are square matrices of the same size,

$$|\det(I-AB^*)|^2 + |\det(A+B)|^2 \le \det(I+AA^*)\det(I+B^*B);$$

equality holds if and only if $n = 1$ or $A + B = 0$ or $AB^* = I$.
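
The inequality can be spot-checked numerically. The following is a minimal sketch, assuming real $2 \times 2$ matrices (so that the conjugate transpose is the ordinary transpose); the matrices `A` and `B` and all helper names are illustrative choices, not taken from the text.

```python
# Spot-check of |det(I - A B*)|^2 + |det(A + B)|^2 <= det(I + A A*) det(I + B* B)
# for one pair of real 2x2 matrices (illustrative data, exact integer arithmetic).

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(X):
    return [[X[j][i] for j in range(2)] for i in range(2)]

def add(X, Y):
    return [[X[i][j] + Y[i][j] for j in range(2)] for i in range(2)]

def sub(X, Y):
    return [[X[i][j] - Y[i][j] for j in range(2)] for i in range(2)]

def det2(X):
    return X[0][0] * X[1][1] - X[0][1] * X[1][0]

I2 = [[1, 0], [0, 1]]
A = [[1, 2], [0, 1]]
B = [[0, 1], [1, 1]]

lhs = det2(sub(I2, matmul(A, transpose(B)))) ** 2 + det2(add(A, B)) ** 2
rhs = (det2(add(I2, matmul(A, transpose(A))))
       * det2(add(I2, matmul(transpose(B), B))))
print(lhs, rhs)
```

Since $n = 2$ here and neither $A + B = 0$ nor $AB^* = I$ holds, the inequality is strict for this pair.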


As consequences of (7.13), we have the Löwner partial orderings

$$I + AA^* \ge (A+B)(I+B^*B)^{-1}(A+B)^* \ge 0$$

and

$$I + AA^* \ge (I-AB^*)(I+BB^*)^{-1}(I-AB^*)^* \ge 0,$$

and thus the determinant inequalities

$$|\det(A+B)|^2 \le \det(I+AA^*)\det(I+B^*B)$$

and

$$|\det(I-AB^*)|^2 \le \det(I+AA^*)\det(I+B^*B). \tag{7.20}$$


Using similar ideas, one can derive the Hua determinant inequality
(Problem 15) for contractive matrices. Recall that a matrix $X$ is
contractive if $I - X^*X \ge 0$, and strictly contractive if $I - X^*X > 0$.
It is readily seen that the product of contractive matrices is a
contractive matrix (the last step uses $I \ge B^*B$):

$$I \ge A^*A \;\Rightarrow\; B^*B \ge B^*(A^*A)B \;\Rightarrow\; I \ge (AB)^*(AB).$$

To show Hua's result, one shows that

$$(I-B^*A)(I-A^*A)^{-1}(I-A^*B)$$

is equal to

$$(B-A)^*(I-AA^*)^{-1}(B-A) + (I-B^*B);$$

both expressions are positive semidefinite when $A$ and $B$ are strict
contractions. This implies the matrix inequality

$$I - B^*B \le (I-B^*A)(I-A^*A)^{-1}(I-A^*B). \tag{7.21}$$


Theorem 7.19 (Hua Determinant Inequality) Let $A$ and $B$ be
$m \times n$ contractive matrices. Then

$$|\det(I-A^*B)|^2 \ge \det(I-A^*A)\det(I-B^*B). \tag{7.22}$$

Equality holds if and only if $A = B$.
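
A small exact check of (7.22) may help fix ideas. This is a sketch only: the two real $2 \times 2$ strict contractions below are arbitrary illustrative choices, and the quantity `gap` is the extra term $|\det(A^* - B^*)|^2$ that appears in the refinement stated in Problem 14 of this section.

```python
from fractions import Fraction as F

# Exact check of Hua's inequality |det(I - A*B)|^2 >= det(I - A*A) det(I - B*B)
# for two real 2x2 strict contractions (illustrative data).

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(X):
    return [[X[j][i] for j in range(2)] for i in range(2)]

def sub(X, Y):
    return [[X[i][j] - Y[i][j] for j in range(2)] for i in range(2)]

def det2(X):
    return X[0][0] * X[1][1] - X[0][1] * X[1][0]

I2 = [[F(1), F(0)], [F(0), F(1)]]
A = [[F(1, 2), F(0)], [F(1, 4), F(1, 2)]]
B = [[F(0), F(1, 2)], [F(-1, 2), F(0)]]
At, Bt = transpose(A), transpose(B)

hua_lhs = det2(sub(I2, matmul(At, B))) ** 2
hua_rhs = det2(sub(I2, matmul(At, A))) * det2(sub(I2, matmul(Bt, B)))
gap = det2(sub(A, B)) ** 2      # |det(A - B)|^2, see Problem 14

print(hua_lhs, hua_rhs, gap)
```

Using `Fraction` keeps the arithmetic exact, so the comparison is not affected by rounding.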


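Hua's determinant inequality is a reversal of (7.20) under the
condition that $A$ and $B$ be contractive matrices of the same size.
Note that the matrix inequality (7.21) is equivalent to saying

$$\begin{pmatrix} (I-A^*A)^{-1} & (I-B^*A)^{-1} \\ (I-A^*B)^{-1} & (I-B^*B)^{-1} \end{pmatrix} \ge 0.$$

Question: Is the above block matrix the same as

$$\begin{pmatrix} (I-A^*A)^{-1} & (I-A^*B)^{-1} \\ (I-B^*A)^{-1} & (I-B^*B)^{-1} \end{pmatrix}?$$

If not, is the latter block matrix positive semidefinite? Note that in
general $\begin{pmatrix} X & Y^* \\ Y & Z \end{pmatrix} \ge 0 \;\not\Rightarrow\; \begin{pmatrix} X & Y \\ Y^* & Z \end{pmatrix} \ge 0$ (see Problem 24, Section 7.3).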


Problems 


1. Let $A_{11}$ be a principal submatrix of a square matrix $A$. Show that

$$A > 0 \iff A_{11} > 0 \ \text{ and } \ \widetilde{A_{11}} > 0.$$


2. Show by writing $A = (A^{-1})^{-1}$ that for any principal submatrix $A_{11}$,

3. Let $A_{11}$ and $B_{11}$ be the corresponding principal submatrices of the
$n \times n$ positive semidefinite matrices $A$ and $B$, respectively. Show that

$$4(A_{11}+B_{11})^{-1} \le \widetilde{A_{22}}^{\,-1} + \widetilde{B_{22}}^{\,-1}.$$


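4. Let $A$ and $B$ be positive definite. Show that $A \le B \Rightarrow \widetilde{A_{11}} \le \widetilde{B_{11}}$.


5. Let $A \ge 0$. Show that for any matrices $X$ and $Y$ of appropriate sizes,

$$\begin{pmatrix} X^*AX & X^*AY \\ Y^*AX & Y^*AY \end{pmatrix} \ge 0.$$


6. Use the block matrix $\begin{pmatrix} I & A \\ A^* & I \end{pmatrix}$ to show the matrix identities

$$(I-A^*A)^{-1} = I + A^*(I-AA^*)^{-1}A$$

and

$$I + AA^*(I-AA^*)^{-1} = (I-AA^*)^{-1}.$$


7. Show that if $P$ is the elementary matrix of a type III operation on
$A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}$; that is, $P = \begin{pmatrix} I & X \\ 0 & I \end{pmatrix}$,
then the Schur complements of $A_{11}$ in $A$ and in $P^TAP$ are the same.


8. Let $A$, $B$, $C$, and $D$ be $n \times n$ nonsingular complex matrices. Show that

$$\begin{vmatrix} A^{-1} & B^{-1} \\ C^{-1} & D^{-1} \end{vmatrix} = \frac{(-1)^n}{\det(ACBD)} \begin{vmatrix} A & C \\ B & D \end{vmatrix}.$$


9. Let $A$, $C$ be $m \times n$ matrices, and $B$, $D$ be $m \times p$ matrices. Show that

$$\begin{pmatrix} AA^*+BB^* & AC^*+BD^* \\ CA^*+DB^* & CC^*+DD^* \end{pmatrix} \ge 0.$$

10. Show that for matrices $A$, $B$, $C$, and $D$ of appropriate sizes,

$$|\det(AC+BD)|^2 \le \det(AA^*+BB^*)\det(C^*C+D^*D).$$

In particular, for any two square matrices $X$ and $Y$ of the same size,

$$|\det(X+Y)|^2 \le \det(I+XX^*)\det(I+Y^*Y)$$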


and

$$|\det(I+XY)|^2 \le \det(I+XX^*)\det(I+Y^*Y).$$

11. Prove or disprove that for any $n$-square complex matrices $A$ and $B$,

$$\det(A^*A+B^*B) = \det(A^*A+BB^*)$$

or

$$\det(A^*A+B^*B) = \det(AA^*+BB^*).$$

12. Let $A$ and $B$ be $m \times n$ matrices. Show that for any $n \times n$ matrix $X$,

$$AA^*+BB^* = (B+AX)(I+X^*X)^{-1}(B+AX)^* + (A-BX^*)(I+XX^*)^{-1}(A-BX^*)^*.$$

13. Denote $H(X) = \frac12(X^*+X)$ for a square matrix $X$. For any $n$-square
matrices $A$ and $B$, explain why $(A-B)^*(A-B) \ge 0$. Show that

$$H(I-A^*B) \ge \tfrac12\bigl((I-A^*A) + (I-B^*B)\bigr).$$

14. Let $A$ and $B$ be square contractive matrices of the same size. Derive

$$\det(I-A^*A)\det(I-B^*B) + |\det(A^*-B^*)|^2 \le |\det(I-A^*B)|^2$$

by applying Theorem 7.17 to the block matrix

$$\begin{pmatrix} I-A^*A & I-A^*B \\ I-B^*A & I-B^*B \end{pmatrix} = \begin{pmatrix} I & A^* \\ I & B^* \end{pmatrix}\begin{pmatrix} I & I \\ -A & -B \end{pmatrix}.$$

Show that the determinant of the block matrix on the left-hand side
is $(-1)^n|\det(A-B)|^2$. As a consequence of the inequality,

$$|\det(I-A^*B)|^2 \ge \det(I-A^*A)\det(I-B^*B)$$

with equality if and only if $A = B$ when $A$, $B$ are strict contractions.

15. Show Theorem 7.19 by the method in the second proof of (7.13).

16. Let $A$, $B$, $C$, and $D$ be square matrices of the same size. Show that

$$I + D^*C - (I+D^*B)(I+A^*B)^{-1}(I+A^*C) = (D-A)^*(I+BA^*)^{-1}(C-B)$$

if the inverses involved exist, by considering the block matrix

$$\begin{pmatrix} I+A^*B & I+A^*C \\ I+D^*B & I+D^*C \end{pmatrix}.$$


7.5 The Kronecker and Hadamard Products
of Positive Semidefinite Matrices


The Kronecker product and Hadamard product were introduced in 
Chapter 4 and basic properties were presented there. In this section 
we are interested in the matrix inequalities of the Kronecker and 
Hadamard products of positive semidefinite matrices. 


Theorem 7.20 Let $A \ge 0$ and $B \ge 0$. Then $A \otimes B \ge 0$.


Proof. Let $A = U^*CU$ and $B = V^*DV$, where $C$ and $D$ are diagonal
matrices with nonnegative entries on the main diagonals, and $U$ and
$V$ are unitary matrices. Thus, by Theorem 4.6,

$$A \otimes B = (U^*CU) \otimes (V^*DV) = (U^* \otimes V^*)(C \otimes D)(U \otimes V) \ge 0.$$
∎


Note that A and B in the theorem may have different sizes. 

Our next celebrated theorem of Schur on Hadamard products is 
used repeatedly in deriving matrix inequalities that involve Hadamard 
products of positive semidefinite matrices. 


Theorem 7.21 (Schur) Let $A$ and $B$ be $n$-square matrices. Then

$$A \ge 0,\ B \ge 0 \;\Rightarrow\; A \circ B \ge 0$$

and

$$A > 0,\ B > 0 \;\Rightarrow\; A \circ B > 0.$$


Proof 1. Since the Hadamard product $A \circ B$ is a principal submatrix
of the Kronecker product $A \otimes B$, which is positive semidefinite by
the preceding theorem, the positive semidefiniteness of $A \circ B$ follows.

For the positive definite case, it is sufficient to notice that a
principal submatrix of a positive definite matrix is also positive definite.


Proof 2. Write, by Theorem 7.3, $A = U^*U$ and $B = V^*V$, and let $u_i$
and $v_i$ be the $i$th columns of matrices $U$ and $V$, respectively. Then

$$a_{ij} = u_i^*u_j = (u_j, u_i), \qquad b_{ij} = v_i^*v_j = (v_j, v_i)$$

for each pair $i$ and $j$, and thus (Problem 7, Section 4.3)

$$A \circ B = (a_{ij}b_{ij}) = \bigl((u_j,u_i)(v_j,v_i)\bigr) = \bigl((u_j \otimes v_j,\ u_i \otimes v_i)\bigr) \ge 0.$$


Proof 3. Let $A = \sum_{i=1}^n \lambda_i u_iu_i^*$, where the $\lambda_i$ are the eigenvalues of $A$,
thus nonnegative; the $u_i$ are orthonormal column vectors. Denote by $U_i$
the diagonal matrix with the components of $u_i$ on the main diagonal
of $U_i$. Note that $(u_iu_i^*) \circ B = U_iBU_i^*$. We have

$$A \circ B = \Bigl(\sum_{i=1}^n \lambda_i u_iu_i^*\Bigr) \circ B = \sum_{i=1}^n \lambda_i (u_iu_i^*) \circ B = \sum_{i=1}^n \lambda_i U_iBU_i^* \ge 0.$$


Proof 4. For vector $x \in \mathbb{C}^n$, denote by $\operatorname{diag} x$ the $n$-square diagonal
matrix with the components of $x$ on the diagonal. We have

$$\begin{aligned}
x^*(A \circ B)x &= \operatorname{tr}\bigl(\operatorname{diag} x^*\, A\, \operatorname{diag} x\; B^T\bigr) \\
&= \operatorname{tr}\bigl((B^{1/2})^T \operatorname{diag} x^*\, A^{1/2}A^{1/2} \operatorname{diag} x\, (B^{1/2})^T\bigr) \\
&= \operatorname{tr}\bigl(A^{1/2} \operatorname{diag} x\, (B^{1/2})^T\bigr)^*\bigl(A^{1/2} \operatorname{diag} x\, (B^{1/2})^T\bigr) \ge 0.
\end{aligned}$$
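
Schur's theorem is easy to test on a concrete pair. The sketch below builds two real positive definite matrices as Gram matrices $A = U^TU$ and $B = V^TV$ (in the spirit of Proof 2) and checks that all leading principal minors of $A \circ B$ are positive. The matrices `U`, `V` and the helper names are illustrative assumptions.

```python
# Schur product theorem on a concrete example: A = U^T U and B = V^T V are
# positive definite (U, V nonsingular), so A o B should be positive definite.
# Positive definiteness is checked through the leading principal minors.

def gram(M):
    """Return M^T M for a real square matrix M given as a list of rows."""
    n = len(M)
    return [[sum(M[k][i] * M[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def hadamard(A, B):
    """Entrywise (Hadamard) product of two matrices of the same size."""
    return [[a * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def det(M):
    """Determinant by Laplace expansion along the first row (tiny n only)."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j in range(len(M)):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total

def leading_minors(M):
    return [det([row[:k] for row in M[:k]]) for k in range(1, len(M) + 1)]

U = [[1, 2, 0], [0, 1, 1], [1, 0, 1]]   # det U = 3
V = [[1, 0, 1], [1, 1, 0], [0, 1, 2]]   # det V = 3
A, B = gram(U), gram(V)
H = hadamard(A, B)

print(H)
print(leading_minors(H))
```

All three leading principal minors of `H` come out positive, which for a symmetric matrix is equivalent to positive definiteness.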


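We are now ready to compare the pairs involving squares and
inverses such as $(A \circ B)^2$ and $A^2 \circ B^2$, and $(A \circ B)^{-1}$ and $A^{-1} \circ B^{-1}$.


Theorem 7.22 Let $A \ge 0$ and $B \ge 0$ be of the same size. Then

$$A^2 \circ B^2 \ge (A \circ B)^2.$$

Moreover, if $A$ and $B$ are nonsingular, then

$$A^{-1} \circ B^{-1} \ge (A \circ B)^{-1} \quad\text{and}\quad A \circ A^{-1} \ge I.$$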


Proof. Let $a_i$ and $b_i$ be the $i$th columns of matrices $A$ and $B$,
respectively. It is easy to verify by a direct computation that

$$(AA^*) \circ (BB^*) = (A \circ B)(A^* \circ B^*) + \sum_{i \ne j} (a_i \circ b_j)(a_i^* \circ b_j^*).$$

It follows that for any matrices $A$ and $B$ of the same size

$$(AA^*) \circ (BB^*) \ge (A \circ B)(A^* \circ B^*).$$

In particular, if $A$ and $B$ are positive semidefinite, then

$$A^2 \circ B^2 \ge (A \circ B)^2.$$

If $A$ and $B$ are nonsingular matrices, then, by Theorem 4.6,

$$(A \otimes B)^{-1} = A^{-1} \otimes B^{-1}.$$

Noticing that $A \circ B$ and $A^{-1} \circ B^{-1}$ are principal submatrices of
$A \otimes B$ and $A^{-1} \otimes B^{-1}$ in the same position, respectively, we have by
Theorem 7.13, with $[X]$ representing a principal submatrix of $X$,

$$A^{-1} \circ B^{-1} = [(A \otimes B)^{-1}] \ge [A \otimes B]^{-1} = (A \circ B)^{-1}.$$

For the last inequality, replacing $B$ with $A^{-1}$ in the above inequality,
we get $A^{-1} \circ A \ge (A \circ A^{-1})^{-1}$, which implies $(A^{-1} \circ A)^2 \ge I$. Taking
the square roots of both sides reveals $A^{-1} \circ A \ge I$. ∎


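The last inequality can also be proven by induction on the size
of the matrices as follows. Partition $A$ and $A^{-1}$ conformally as

$$A = \begin{pmatrix} a & \alpha \\ \alpha^* & A_1 \end{pmatrix} \quad\text{and}\quad A^{-1} = \begin{pmatrix} b & \beta \\ \beta^* & B_1 \end{pmatrix}.$$

By inequality (7.11)

$$A - \begin{pmatrix} \frac1b & 0 \\ 0 & 0 \end{pmatrix} \ge 0, \qquad A^{-1} - \begin{pmatrix} 0 & 0 \\ 0 & A_1^{-1} \end{pmatrix} \ge 0,$$

and by Theorem 7.20,

$$\Bigl(A - \begin{pmatrix} \frac1b & 0 \\ 0 & 0 \end{pmatrix}\Bigr) \circ \Bigl(A^{-1} - \begin{pmatrix} 0 & 0 \\ 0 & A_1^{-1} \end{pmatrix}\Bigr) \ge 0,$$

which yields

$$A \circ A^{-1} \ge \begin{pmatrix} 1 & 0 \\ 0 & A_1 \circ A_1^{-1} \end{pmatrix}.$$

An induction hypothesis on $A_1 \circ A_1^{-1}$ reveals $A \circ A^{-1} \ge I$.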
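
For a concrete instance of $A \circ A^{-1} \ge I$, here is a worked $2 \times 2$ example (the matrix is chosen only for illustration; $\det A = 1$ keeps the inverse integral).

```latex
A = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix},\qquad
A^{-1} = \begin{pmatrix} 1 & -1 \\ -1 & 2 \end{pmatrix},\qquad
A \circ A^{-1} = \begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix},
\\[4pt]
A \circ A^{-1} - I = \begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix}
 = \begin{pmatrix} 1 \\ -1 \end{pmatrix}\begin{pmatrix} 1 & -1 \end{pmatrix} \ \ge 0 .
```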


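Theorem 7.23 Let $A$, $B$, and $C$ be $n$-square matrices.

$$\text{If } \begin{pmatrix} A & B \\ B^* & C \end{pmatrix} \ge 0, \text{ then } A \circ C \ge \pm B \circ B^*.$$

Proof.

$$\begin{pmatrix} A & B \\ B^* & C \end{pmatrix} \ge 0 \;\Rightarrow\; \begin{pmatrix} C & B^* \\ B & A \end{pmatrix} \ge 0 \;\Rightarrow\; \begin{pmatrix} A \circ C & B \circ B^* \\ B^* \circ B & C \circ A \end{pmatrix} \ge 0.$$

The desired inequalities are immediate from the fact (Problem 4)
that $\begin{pmatrix} H & K \\ K & H \end{pmatrix} \ge 0 \iff H \ge \pm K$, where $H \ge 0$ and $K$ is Hermitian. ∎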


Our next result is an analogue of Theorem 7.16 of Section 7.3. 


Theorem 7.24 Let $A$ be a $kn \times kn$ positive semidefinite matrix
partitioned as $A = (A_{ij})$, where each $A_{ij}$ is an $n \times n$ matrix, $1 \le i, j \le k$.
Then the $k \times k$ matrix $T = (\operatorname{tr} A_{ij})$ is positive semidefinite.


Proof 1. Let $A = R^*R$, where $R$ is $nk$-by-$nk$. Partition $R =
(R_1, R_2, \dots, R_k)$, where each $R_i$ is $nk \times n$, $i = 1, 2, \dots, k$. Then
$A_{ij} = R_i^*R_j$. Note that $\operatorname{tr}(R_i^*R_j) = (R_j, R_i)$, an inner product of
the space of $nk \times n$ matrices. Thus $T = (\operatorname{tr}(A_{ij})) = (\operatorname{tr}(R_i^*R_j)) =
((R_j, R_i))$ is a Gram matrix. So $T \ge 0$.


Proof 2. Let $e_t \in \mathbb{C}^n$ denote the column vector with the $t$th
component 1 and 0 elsewhere, $t = 1, 2, \dots, n$. Then for any $n \times n$ matrix
$X = (x_{ij})$, $x_{tt} = e_t^*Xe_t$. Thus, $\operatorname{tr} A_{ij} = \sum_{t=1}^n e_t^*A_{ij}e_t$. Therefore,

$$\begin{aligned}
T = (\operatorname{tr} A_{ij}) &= \Bigl(\sum_{t=1}^n e_t^*A_{ij}e_t\Bigr) = \sum_{t=1}^n (e_t^*A_{ij}e_t) \\
&= \sum_{t=1}^n E_t^*(A_{ij})E_t = \sum_{t=1}^n E_t^*AE_t \ge 0,
\end{aligned}$$

where $E_t = \operatorname{diag}(e_t, \dots, e_t)$ is $nk \times k$, with $k$ copies of $e_t$. ∎


One can also prove the theorem using a similar idea by first
extracting the diagonal entries of each $A_{ij}$ through $B = J_k \otimes I_n$, where
$J_k$ is the $k \times k$ matrix all of whose entries are 1. Note that $A \circ B \ge 0$.
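
The statement of Theorem 7.24 can be illustrated as follows. In this sketch $A = R^TR$ is a real $4 \times 4$ positive semidefinite matrix viewed as a $2 \times 2$ array of $2 \times 2$ blocks ($n = k = 2$); the matrix `R` and the helper `block_traces` are hypothetical names introduced for the example.

```python
# Theorem 7.24 on an example: for A = R^T R (4x4, PSD) split into 2x2 blocks,
# the 2x2 matrix T of block traces should again be positive semidefinite.

def gram(R):
    """Return R^T R for a real square matrix R given as a list of rows."""
    m = len(R)
    return [[sum(R[r][i] * R[r][j] for r in range(m)) for j in range(m)]
            for i in range(m)]

def block_traces(A, n, k):
    """T[i][j] = trace of the (i, j) block of size n x n of the nk x nk matrix A."""
    return [[sum(A[i * n + t][j * n + t] for t in range(n)) for j in range(k)]
            for i in range(k)]

R = [[1, 0, 2, 1],
     [0, 1, 1, 0],
     [1, 1, 0, 1],
     [2, 0, 1, 1]]
A = gram(R)
T = block_traces(A, n=2, k=2)
det_T = T[0][0] * T[1][1] - T[0][1] * T[1][0]

print(T, det_T)
```

A symmetric $2 \times 2$ matrix with nonnegative diagonal and nonnegative determinant is positive semidefinite, which is what the printed values confirm.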


Problems 


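1. Show that $A \ge 0 \iff \operatorname{tr}(A \circ B) \ge 0$ for all $B \ge 0$, where $A, B \in M_n$.

2. Let $A \ge 0$ and $B \ge 0$ be of the same size. Show that

$$A \ge B \iff A \otimes I \ge B \otimes I \iff A \otimes A \ge B \otimes B.$$

3. Let $A \ge 0$ and $B \ge 0$ be of the same size. Show that

(a) $\operatorname{tr}(AB) \le \operatorname{tr}(A \otimes B) = \operatorname{tr}A \operatorname{tr}B \le \frac14(\operatorname{tr}A + \operatorname{tr}B)^2$.
(b) $\operatorname{tr}(A \circ B) \le \frac12\operatorname{tr}(A \circ A + B \circ B)$.
(c) $\operatorname{tr}(A \otimes B) \le \frac12\operatorname{tr}(A \otimes A + B \otimes B)$.
(d) $\det(A \otimes B) \le \frac12\bigl(\det(A \otimes A) + \det(B \otimes B)\bigr)$.


4. Let $H$ and $K$ be $n$-square Hermitian matrices. Show that

$$\begin{pmatrix} H & K \\ K & H \end{pmatrix} \ge 0 \iff H \ge \pm K.$$


5. Let $A \ge 0$ and $B \ge 0$ be of the same size. Show that

$$\operatorname{rank}(A \circ B) \le \operatorname{rank}(A)\operatorname{rank}(B).$$

Show further that if $A > 0$, then $\operatorname{rank}(A \circ B)$ is equal to the number
of nonzero diagonal entries of the matrix $B$.


6. Let $A > 0$ and let $\lambda$ be any eigenvalue of $A^{-1} \circ A$. Show that

(a) $\lambda \ge 1$, (b) $A^{-1} + A \ge 2I$, (c) $A^{-1} \circ A^{-1} \ge (A \circ A)^{-1}$.


7. Let $A$, $B$, $C$, and $D$ be $n \times n$ positive semidefinite matrices. Show that

$$A \ge B \;\Rightarrow\; A \circ C \ge B \circ C, \quad A \otimes C \ge B \otimes C;$$

$$A \ge B,\ C \ge D \;\Rightarrow\; A \circ C \ge B \circ D, \quad A \otimes C \ge B \otimes D.$$


8. Let $A > 0$ and $B > 0$ be of the same size. Show that

$$(A^{-1} \circ B^{-1})^{-1} \le A \circ B \le (A^2 \circ B^2)^{1/2}.$$


9. Prove or disprove that for $A \ge 0$ and $B \ge 0$ of the same size

$$A^3 \circ B^3 \ge (A \circ B)^3 \quad\text{or}\quad A^{1/2} \circ B^{1/2} \le (A \circ B)^{1/2}.$$

10. Show that for any square matrices $A$ and $B$ of the same size

$$(A \circ B)(A^* \circ B^*) \le (\sigma^2 I) \circ (AA^*),$$

where $\sigma = \sigma_{\max}(B)$ is the largest singular value of $B$.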


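11. Let $A$ and $B$ be complex matrices of the same size. Show that

$$\begin{pmatrix} (AA^*) \circ I & A \circ B \\ A^* \circ B^* & (B^*B) \circ I \end{pmatrix} \ge 0, \qquad \begin{pmatrix} (AA^*) \circ (BB^*) & A \circ B \\ A^* \circ B^* & I \end{pmatrix} \ge 0.$$

12. Let $A \ge 0$ and $B \ge 0$ be of the same size. Let $\lambda$ be the largest
eigenvalue of $A$ and $\mu$ be the largest diagonal entry of $B$. Show that

$$A \circ B \le \lambda I \circ B \le \lambda\mu I \quad\text{and}\quad \begin{pmatrix} \lambda\mu I & A \circ B \\ A \circ B & \lambda\mu I \end{pmatrix} \ge 0.$$

13. Let $A$, $B$, $C$, $D$, $X$, $Y$, $U$, and $V$ be $n \times n$ complex matrices and let

$$M = \begin{pmatrix} A & B \\ C & D \end{pmatrix}, \quad N = \begin{pmatrix} X & Y \\ U & V \end{pmatrix}, \quad M \odot N = \begin{pmatrix} A \otimes X & B \otimes Y \\ C \otimes U & D \otimes V \end{pmatrix}.$$

Show that $M \circ N$ is a principal submatrix of $M \odot N$ and that $M \odot N$ is
a principal submatrix of $M \otimes N$. Moreover, $M \odot N \ge 0$ if $M, N \ge 0$.

14. Let $A = (a_{ij}) \ge 0$. Show that $A \circ A = (a_{ij}^2) \ge 0$ and $A \circ A^T =
A \circ \bar A = (|a_{ij}|^2) \ge 0$. Show also that $(a_{ij}^3) \ge 0$. How about $(|a_{ij}|^3)$
and $(\sqrt{|a_{ij}|})$? Show, however, that the matrix $|A| = (|a_{ij}|)$ is not
necessarily positive semidefinite as one checks for

$$A = \begin{pmatrix} 1 & a & 0 & -a \\ a & 1 & a & 0 \\ 0 & a & 1 & a \\ -a & 0 & a & 1 \end{pmatrix}, \qquad a = \frac{1}{\sqrt2}.$$

15. Let $\lambda_1, \lambda_2, \dots, \lambda_n$ be positive numbers. Use Cauchy matrices to show
that the following matrices are positive semidefinite.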


1 1 iy 
N+ A; , Nir; , Ni + Aj : 

1 2 
(eam) (va); Gar). 
(Gane) (SK) eos) 
MOAL+AVNAR)” LAF AV)? (WH AZ” 


16. Let $A = (A_{ij})$ be an $nk \times nk$ partitioned matrix, where each block
$A_{ij}$ is $n \times n$. If $A$ is positive semidefinite, show that the matrices
$C = (C_m(A_{ij}))$ and $E = (E_m(A_{ij}))$ are positive semidefinite, where
$C_m(X)$ and $E_m(X)$, $1 \le m \le n$, denote the $m$th compound matrix
and the $m$th elementary symmetric function of $n \times n$ matrix $X$,
respectively. Deduce that the positivity of $A = (A_{ij})$ implies that of
$D = (\det(A_{ij}))$. [Hint: See Section 4.4 on compound matrices.]


7.6 Schur Complements and the Hadamard Product


The goal of this section is to obtain some inequalities for matrix sums 
and Hadamard products using Schur complements. 

As we saw earlier (Theorem 7.7 and Theorem 7.22), for any positive
definite matrices $A$ and $B$ of the same size

$$(A+B)^{-1} \le \tfrac14\bigl(A^{-1}+B^{-1}\bigr)$$

and

$$(A \circ B)^{-1} \le A^{-1} \circ B^{-1}.$$

These are special cases of the next theorem whose proof uses the fact

$$\begin{pmatrix} A & B \\ B^* & B^*A^{-1}B \end{pmatrix} \ge 0 \quad\text{whenever } A > 0.$$


Theorem 7.25 Let $A$ and $B$ be $n$-square positive definite matrices,
and let $C$ and $D$ be any matrices of size $m \times n$. Then

$$(C+D)(A+B)^{-1}(C+D)^* \le CA^{-1}C^* + DB^{-1}D^*, \tag{7.23}$$

$$(C \circ D)(A \circ B)^{-1}(C \circ D)^* \le (CA^{-1}C^*) \circ (DB^{-1}D^*). \tag{7.24}$$


Proof. Note that $X \ge 0$, $Y \ge 0$, $X + Y \ge 0$, and $X \circ Y \ge 0$, where

$$X = \begin{pmatrix} A & C^* \\ C & CA^{-1}C^* \end{pmatrix}, \qquad Y = \begin{pmatrix} B & D^* \\ D & DB^{-1}D^* \end{pmatrix}.$$

The inequalities are immediate by taking the Schur complement of
the (1,1)-block in $X + Y \ge 0$ and $X \circ Y \ge 0$, respectively. ∎

An alternative approach to proving (7.24) is to use Theorem 7.15
with the observation that $X \circ Y$ is a principal submatrix of $X \otimes Y$.

By taking $A = B = I_n$ in (7.23) and (7.24), we have

$$\tfrac12(C+D)(C+D)^* \le CC^* + DD^*$$
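and

$$(C \circ D)(C \circ D)^* \le (CC^*) \circ (DD^*).$$

Theorem 7.26 Let $A$ and $B$ be positive definite matrices of the
same size partitioned conformally as

$$A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}, \qquad B = \begin{pmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{pmatrix}.$$

Then

$$\widetilde{A_{11}+B_{11}} \ge \widetilde{A_{11}} + \widetilde{B_{11}}, \tag{7.25}$$

$$\widetilde{A_{11} \circ B_{11}} \ge \widetilde{A_{11}} \circ \widetilde{B_{11}}. \tag{7.26}$$

Proof. Let

$$\hat A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{21}A_{11}^{-1}A_{12} \end{pmatrix}, \qquad \hat B = \begin{pmatrix} B_{11} & B_{12} \\ B_{21} & B_{21}B_{11}^{-1}B_{12} \end{pmatrix}.$$

The inequality (7.25) is obtained by taking the Schur complement of
$A_{11} + B_{11}$ in $A + B$ and using (7.23). For (7.26), notice that

$$A_{22} \ge A_{21}A_{11}^{-1}A_{12}, \qquad B_{22} \ge B_{21}B_{11}^{-1}B_{12}.$$

Therefore,

$$A_{22} \circ (B_{21}B_{11}^{-1}B_{12}) + B_{22} \circ (A_{21}A_{11}^{-1}A_{12}) \ge 2\bigl((A_{21}A_{11}^{-1}A_{12}) \circ (B_{21}B_{11}^{-1}B_{12})\bigr).$$

It follows that

$$\begin{aligned}
\widetilde{A_{11}} \circ \widetilde{B_{11}} &= (A_{22} - A_{21}A_{11}^{-1}A_{12}) \circ (B_{22} - B_{21}B_{11}^{-1}B_{12}) \\
&\le A_{22} \circ B_{22} - (A_{21}A_{11}^{-1}A_{12}) \circ (B_{21}B_{11}^{-1}B_{12}).
\end{aligned}$$

Applying (7.24) to the right-hand side of the above inequality yields

$$\begin{aligned}
& A_{22} \circ B_{22} - (A_{21}A_{11}^{-1}A_{12}) \circ (B_{21}B_{11}^{-1}B_{12}) \\
&\quad \le A_{22} \circ B_{22} - (A_{21} \circ B_{21})(A_{11} \circ B_{11})^{-1}(A_{12} \circ B_{12}) \\
&\quad = \widetilde{A_{11} \circ B_{11}}.
\end{aligned}$$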


Thus $\widetilde{A_{11}} \circ \widetilde{B_{11}} \le \widetilde{A_{11} \circ B_{11}}$. ∎

We end this section with a determinant inequality of Oppenheim.
The proof uses a Schur complement technique.
As we recall, for $A > 0$ and any principal submatrix $A_{11}$ of $A$

$$\det A = \det A_{11} \det \widetilde{A_{11}}.$$


Theorem 7.27 (Oppenheim) Let $A$ and $B$ be $n \times n$ positive
semidefinite matrices with diagonal entries $a_{ii}$ and $b_{ii}$, respectively. Then

$$\prod_{i=1}^n a_{ii}b_{ii} \ge \det(A \circ B) \ge a_{11} \cdots a_{nn} \det B \ge \det A \det B.$$


Proof. The first and last inequalities are immediate from the Hadamard
determinant inequality. We show the second inequality.

Let $\hat B$ be as in the proof of the preceding theorem. Consider $A \circ \hat B$
this time and use induction on $n$, the order of the matrices.

If $n = 2$, then it is obvious. Suppose $n > 2$. Notice that

$$B_{21}B_{11}^{-1}B_{12} = B_{22} - \widetilde{B_{11}}.$$

Take the Schur complement of $A_{11} \circ B_{11}$ in $A \circ \hat B$ to get

$$A_{22} \circ (B_{22} - \widetilde{B_{11}}) - (A_{21} \circ B_{21})(A_{11} \circ B_{11})^{-1}(A_{12} \circ B_{12}) \ge 0$$

or

$$A_{22} \circ B_{22} - (A_{21} \circ B_{21})(A_{11} \circ B_{11})^{-1}(A_{12} \circ B_{12}) \ge A_{22} \circ \widetilde{B_{11}}.$$

Observe that the left-hand side of the above inequality is the Schur
complement of $A_{11} \circ B_{11}$ in $A \circ B$. By taking determinants, we have

$$\det\bigl(\widetilde{A_{11} \circ B_{11}}\bigr) \ge \det\bigl(A_{22} \circ \widetilde{B_{11}}\bigr).$$

Multiply both sides by $\det(A_{11} \circ B_{11})$ to obtain

$$\det(A \circ B) \ge \det(A_{11} \circ B_{11}) \det\bigl(A_{22} \circ \widetilde{B_{11}}\bigr).$$


The assertion then follows from the induction hypothesis on the two
determinants on the right-hand side. ∎

Let $\lambda_i(X)$ be the eigenvalues of $n \times n$ matrix $X$, $i = 1, \dots, n$. The
inequalities in Theorem 7.27 are rewritten in terms of eigenvalues as

$$\prod_{i=1}^n a_{ii}b_{ii} \ge \prod_{i=1}^n \lambda_i(A \circ B) \ge \prod_{i=1}^n a_{ii}\lambda_i(B) \ge \prod_{i=1}^n \lambda_i(AB) = \prod_{i=1}^n \lambda_i(A)\lambda_i(B).$$
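
The whole chain in Oppenheim's theorem can be evaluated on a small example. The sketch below uses two real $3 \times 3$ positive definite Gram matrices; the data `U`, `V` and the helper names are illustrative assumptions rather than anything prescribed by the theorem.

```python
# Oppenheim's chain  prod a_ii b_ii >= det(A o B) >= a_11...a_nn det B >= det A det B
# evaluated exactly for A = U^T U and B = V^T V (illustrative integer data).

def gram(M):
    """Return M^T M for a real square matrix M given as a list of rows."""
    n = len(M)
    return [[sum(M[k][i] * M[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def hadamard(A, B):
    return [[a * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def det(M):
    """Determinant by Laplace expansion along the first row (tiny n only)."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j in range(len(M)):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total

U = [[1, 2, 0], [0, 1, 1], [1, 0, 1]]
V = [[1, 0, 1], [1, 1, 0], [0, 1, 2]]
A, B = gram(U), gram(V)

diag_a, diag_ab = 1, 1
for i in range(len(A)):
    diag_a *= A[i][i]                # a_11 ... a_nn
    diag_ab *= A[i][i] * B[i][i]     # product of a_ii b_ii

chain = [diag_ab, det(hadamard(A, B)), diag_a * det(B), det(A) * det(B)]
print(chain)
```

The printed list is nonincreasing, in agreement with the three inequalities of the theorem.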


Problems 


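1. A correlation matrix is a positive semidefinite matrix all of whose
diagonal entries are equal to 1. Let $A$ be an $n \times n$ positive semidefinite
matrix. Show that $\min_X \det(A \circ X) = \det A$, where the minimal value
is taken over all $n \times n$ correlation matrices $X$.


2. Let $A > 0$. Use the identity $\det \widetilde{A_{11}} = \frac{\det A}{\det A_{11}}$ to show that

$$\frac{\det(A+B)}{\det(A_{11}+B_{11})} \ge \frac{\det A}{\det A_{11}} + \frac{\det B}{\det B_{11}}.$$


3. Let $A$, $B$, and $A + B$ be invertible matrices. Find the inverse of
$\begin{pmatrix} A & A \\ A & A+B \end{pmatrix}$. Use the Schur complement of $A + B$ to verify that

$$A - A(A+B)^{-1}A = (A^{-1}+B^{-1})^{-1}.$$

4. Let $A > 0$ and $B > 0$ be of the same size. Show that for all $x, y \in \mathbb{C}^n$,

$$(x+y)^*(A+B)^{-1}(x+y) \le x^*A^{-1}x + y^*B^{-1}y$$

and

$$(x \circ y)^*(A \circ B)^{-1}(x \circ y) \le (x^*A^{-1}x) \circ (y^*B^{-1}y).$$


5. Show that for any $m \times n$ complex matrices $A$ and $B$

$$\begin{pmatrix} A^*A & A^* \\ A & I_m \end{pmatrix} \ge 0, \qquad \begin{pmatrix} B^*B & B^* \\ B & I_m \end{pmatrix} \ge 0.$$

Derive the following inequalities using the Schur complement:

$$I_m \ge A(A^*A)^{-1}A^*, \quad\text{if } \operatorname{rank}(A) = n;$$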


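$$(A^*A) \circ (B^*B) \ge (A^* \circ B^*)(A \circ B).$$

6. Let $A \in M_n$ be a positive definite matrix. Show that for any $B \in M_n$,

$$\begin{pmatrix} A \circ (B^*A^{-1}B) & B^* \circ B \\ B^* \circ B & A \circ (B^*A^{-1}B) \end{pmatrix} \ge 0.$$

In particular,

$$\begin{pmatrix} A \circ A^{-1} & I \\ I & A \circ A^{-1} \end{pmatrix} \ge 0.$$

Derive the following inequalities using the Schur complement:

$$(A \circ A^{-1})^{-1} \le A \circ A^{-1};$$

$$\det(B^* \circ B) \le \det\bigl(A \circ (B^*A^{-1}B)\bigr);$$

$$\bigl(\operatorname{tr}(B^* \circ B)^2\bigr)^{1/2} \le \operatorname{tr}\bigl(A \circ (B^*A^{-1}B)\bigr);$$

$$I \circ (B^*B) \ge (B^* \circ B)\bigl(I \circ (B^*B)\bigr)^{-1}(B^* \circ B)$$

if $B$ has no zero row or column. Discuss the analogue for sum ($+$).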


7. Let $A$, $B$, and $C$ be $n \times n$ complex matrices such that $\begin{pmatrix} A & B \\ B^* & C \end{pmatrix} \ge 0$.
With $\star$ denoting the sum $+$ or the Hadamard product $\circ$, show that

$$\bigl(\operatorname{tr}(B^* \star B)^2\bigr)^{1/2} \le \operatorname{tr}(A \star C)$$

and

$$\det(B^* \star B) \le \det(A \star C).$$


8. Let $A$ be a positive definite matrix partitioned as $\begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}$, where
$A_{11}$ and $A_{22}$ are square matrices (maybe of different sizes). Show that

$$A \circ A^{-1} \ge \begin{pmatrix} A_{11} \circ A_{11}^{-1} & 0 \\ 0 & \widetilde{A_{11}} \circ \widetilde{A_{11}}^{\,-1} \end{pmatrix}.$$

Conclude that $A \circ A^{-1} \ge I$. Similarly show that $A^T \circ A^{-1} \ge I$.


9. Let $A_1$ and $A_2$ be $n \times n$ real contractive matrices. Show that

(a) $\det(I - A_i) \ge 0$ for $i = 1, 2$.

(b) $H = (h_{ij}) \ge 0$, where $h_{ij} = 1/\det(I - A_i^*A_j)$, $1 \le i, j \le 2$.

(c) $L = (l_{ij}) \ge 0$, where $l_{ij} = 1/\det(I - A_i^*A_j)^k$, $1 \le i, j \le 2$, and
$k$ is any positive integer.

[Hint: Use the Hua determinant inequality in Section 7.4.]


7.7 The Wielandt and Kantorovich Inequalities


The Cauchy–Schwarz inequality is one of the most useful and
fundamental inequalities in mathematics. It states that for any vectors $x$
and $y$ in an inner product vector space with inner product $(\cdot\,, \cdot)$,

$$|(x, y)|^2 \le (x, x)(y, y)$$

and equality holds if and only if $x$ and $y$ are linearly dependent. Thus

$$|y^*x|^2 \le (x^*x)(y^*y)$$

for all column vectors $x, y \in \mathbb{C}^n$. An easy proof of this is to observe

$$(x, y)^*(x, y) = \begin{pmatrix} x^* \\ y^* \end{pmatrix}(x, y) = \begin{pmatrix} x^*x & x^*y \\ y^*x & y^*y \end{pmatrix} \ge 0.$$


In this section we give a refined version of the Cauchy—Schwarz in- 
equality, show a Cauchy—Schwarz inequality involving matrices, and 
present the Wielandt and Kantorovich inequalities. 


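Lemma 7.1 Let $A = \begin{pmatrix} a & b \\ \bar b & c \end{pmatrix}$ be a nonzero $2 \times 2$ positive semidefinite
matrix with eigenvalues $\alpha$ and $\beta$, $\alpha \ge \beta$. Then

$$|b|^2 \le \Bigl(\frac{\alpha-\beta}{\alpha+\beta}\Bigr)^2 ac. \tag{7.27}$$

Equality holds if and only if $a = c$ or $\beta = 0$, i.e., $A$ is singular.


Proof. Solving the equation $\det(\lambda I - A) = 0$ reveals the eigenvalues

$$\alpha,\ \beta = \frac{(a+c) \pm \sqrt{(a-c)^2 + 4|b|^2}}{2}.$$

Computing $\frac{\alpha-\beta}{\alpha+\beta}$ and squaring it, we get an equivalent form of (7.27),

$$(a-c)^2(ac - |b|^2) \ge 0.$$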


The conclusions follow immediately. ∎

Theorem 7.28 Let $x$ and $y$ be vectors in an inner product space.
Let $A = \begin{pmatrix} (x,x) & (x,y) \\ (y,x) & (y,y) \end{pmatrix}$ have eigenvalues $\alpha$ and $\beta$, $\alpha \ge \beta$, $\alpha > 0$. Then

$$|(x, y)|^2 \le \Bigl(\frac{\alpha-\beta}{\alpha+\beta}\Bigr)^2 (x, x)(y, y). \tag{7.28}$$

Equality holds if and only if $x$ and $y$ have the same length (i.e.,
$\|x\| = \|y\|$) or are linearly dependent (i.e., $A$ is singular or $\beta = 0$).


Proof. The inequality follows from the lemma at once. For the
equality case, we may assume that $\|y\| = 1$. It is sufficient to notice
that $(x - ty, x - ty) = 0$; that is, $x = ty$, when $t = (x, y)$. ∎
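
To see how much (7.28) can sharpen the classical bound, here is a worked instance with the standard inner product on $\mathbb{C}^2$; the vectors are chosen only for illustration.

```latex
x = \begin{pmatrix} 1 \\ 0 \end{pmatrix},\quad
y = \begin{pmatrix} 1 \\ 1 \end{pmatrix}:\qquad
A = \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix},\qquad
\alpha,\beta = \frac{3 \pm \sqrt5}{2},\qquad
\Bigl(\frac{\alpha-\beta}{\alpha+\beta}\Bigr)^2 = \frac{5}{9},
\\[4pt]
|(x,y)|^2 = 1 \;\le\; \frac{5}{9}\cdot 1\cdot 2 = \frac{10}{9}
\;<\; 2 = (x,x)(y,y).
```

The Cauchy–Schwarz bound is 2, whereas (7.28) gives $10/9$. Equivalently, in the notation of Lemma 7.1, $(a-c)^2(ac-|b|^2) = 1 \cdot 1 \ge 0$.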

We proceed to derive more related inequalities in which matrices
are involved. To this end, we use the fact that if $0 \le r \le p \le q \le s$ and $q > 0$,
then $\frac{q-p}{q+p} \le \frac{s-r}{s+r}$. This is because $\frac{t-1}{t+1}$ is an increasing function.


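Theorem 7.29 Let $A$, $B$, and $C$ be $n$-square matrices such that

$$\begin{pmatrix} A & B^* \\ B & C \end{pmatrix} \ge 0.$$

Denote by $M$ the (nonzero) partitioned matrix and let $\alpha$ and $\beta$ be the
largest and smallest eigenvalues of $M$, respectively. Then

$$|(Bx, y)|^2 \le \Bigl(\frac{\alpha-\beta}{\alpha+\beta}\Bigr)^2 (Ax, x)(Cy, y), \qquad x, y \in \mathbb{C}^n. \tag{7.29}$$


Proof. We may assume that $x$ and $y$ are unit (column) vectors. Let

$$N = \begin{pmatrix} x & 0 \\ 0 & y \end{pmatrix}^* M \begin{pmatrix} x & 0 \\ 0 & y \end{pmatrix} = \begin{pmatrix} x^*Ax & x^*B^*y \\ y^*Bx & y^*Cy \end{pmatrix}.$$

Then $N$ is a $2 \times 2$ positive semidefinite matrix. Let $\lambda$ and $\mu$ be the
eigenvalues of $N$ with $\lambda \ge \mu$. By Lemma 7.1, we have

$$|(Bx, y)|^2 \le \Bigl(\frac{\lambda-\mu}{\lambda+\mu}\Bigr)^2 (Ax, x)(Cy, y).$$

To get the desired inequality, with $\beta I \le M \le \alpha I$, pre- and
post-multiplying by $\begin{pmatrix} x & 0 \\ 0 & y \end{pmatrix}^*$ and $\begin{pmatrix} x & 0 \\ 0 & y \end{pmatrix}$, respectively, as $\begin{pmatrix} x & 0 \\ 0 & y \end{pmatrix}^*\begin{pmatrix} x & 0 \\ 0 & y \end{pmatrix} = I_2$,
we obtain that $\beta I \le N \le \alpha I$. It follows that $\beta \le \mu \le \lambda \le \alpha$.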


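Consequently, $0 \le \frac{\lambda-\mu}{\lambda+\mu} \le \frac{\alpha-\beta}{\alpha+\beta}$. Inequality (7.29) then follows. ∎

Theorem 7.30 (Wielandt) Let $A$ be an $n \times n$ positive semidefinite
matrix and $\lambda_1$ and $\lambda_n$ be the largest and smallest eigenvalues of $A$,
respectively. Then for all orthogonal $n$-column vectors $x$ and $y \in \mathbb{C}^n$,

$$|x^*Ay|^2 \le \Bigl(\frac{\lambda_1-\lambda_n}{\lambda_1+\lambda_n}\Bigr)^2 (x^*Ax)(y^*Ay).$$


Proof 1. We may assume that $x$ and $y$ are unit vectors. Let
$M = (x, y)^*A(x, y)$. $M$ is positive semidefinite and

$$M = \begin{pmatrix} x^*Ax & x^*Ay \\ y^*Ax & y^*Ay \end{pmatrix}.$$

Because $\lambda_n I \le A \le \lambda_1 I$, multiplying by $(x, y)^*$ from the left and by
$(x, y)$ from the right, as $(x, y)^*(x, y) = I_2$, we have $\lambda_n I \le M \le \lambda_1 I$.
If $\alpha$ and $\beta$ are the eigenvalues of the $2 \times 2$ matrix $M$ with $\alpha \ge \beta$,
$\alpha > 0$, then $\lambda_n \le \beta \le \alpha \le \lambda_1$. By Lemma 7.1, we have

$$|y^*Ax|^2 \le \Bigl(\frac{\alpha-\beta}{\alpha+\beta}\Bigr)^2 (x^*Ax)(y^*Ay).$$

Using the fact that $\frac{\alpha-\beta}{\alpha+\beta} \le \frac{\lambda_1-\lambda_n}{\lambda_1+\lambda_n}$, we obtain the inequality.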


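Proof 2. Let $x$ and $y$ be orthogonal unit vectors and let $\theta$ be a real
number such that $e^{i\theta}(Ay, x) = |(Ay, x)| = |x^*Ay| = |y^*Ax|$. Because
$\lambda_n I \le A \le \lambda_1 I$, we have, for any complex number $c$,

$$\lambda_n\|x + cy\|^2 \le (A(x + cy), x + cy) \le \lambda_1\|x + cy\|^2.$$

Expanding the inequalities and setting $c = te^{i\theta}$, $t \in \mathbb{R}$ (and replacing
$t$ with $-t$ in the second one, which is permissible because it holds for
every real $t$), we have

$$t^2(y^*Ay - \lambda_n) + 2t|x^*Ay| + x^*Ax - \lambda_n \ge 0 \tag{7.30}$$

and

$$t^2(\lambda_1 - y^*Ay) + 2t|x^*Ay| + \lambda_1 - x^*Ax \ge 0. \tag{7.31}$$

Multiply (7.30) by $\lambda_1$ and (7.31) by $\lambda_n$; then add to get

$$t^2(\lambda_1 - \lambda_n)y^*Ay + 2t(\lambda_1 + \lambda_n)|x^*Ay| + (\lambda_1 - \lambda_n)x^*Ax \ge 0.$$

Since this is true for all real $t$, by taking the discriminant, we have

$$(\lambda_1 + \lambda_n)^2|x^*Ay|^2 \le (\lambda_1 - \lambda_n)^2(x^*Ax)(y^*Ay).$$
∎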
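
The Wielandt inequality can be verified by hand in a simple case. In the following worked example the diagonal matrix and the rotating pair of orthonormal vectors are illustrative choices.

```latex
A = \begin{pmatrix} 4 & 0 \\ 0 & 1 \end{pmatrix},\quad
x = \begin{pmatrix} \cos\theta \\ \sin\theta \end{pmatrix},\quad
y = \begin{pmatrix} -\sin\theta \\ \cos\theta \end{pmatrix}:
\qquad |x^*Ay|^2 = 9\cos^2\theta\,\sin^2\theta,
\\[4pt]
(x^*Ax)(y^*Ay) = (4\cos^2\theta+\sin^2\theta)(4\sin^2\theta+\cos^2\theta)
 = 4\cos^2 2\theta + 25\cos^2\theta\,\sin^2\theta,
\\[4pt]
\Bigl(\tfrac{4-1}{4+1}\Bigr)^2 (x^*Ax)(y^*Ay) - |x^*Ay|^2
 = \tfrac{36}{25}\cos^2 2\theta \;\ge\; 0 .
```

Equality occurs exactly when $\cos 2\theta = 0$, for instance at $\theta = \pi/4$.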
In the classical Cauchy–Schwarz inequality $|y^*x|^2 \le (x^*x)(y^*y)$, if
we substitute $x$ and $y$ with $A^{1/2}x$ and $A^{-1/2}y$, respectively, where $A$
is an $n$-square positive definite matrix, we then have

$$|y^*x|^2 \le (x^*Ax)(y^*A^{-1}y).$$

In particular, for any unit column vector $x$, we obtain

$$1 \le (x^*Ax)(x^*A^{-1}x).$$

A reversal of this is the well-known Kantorovich inequality, which
is a special case of the following more general result.

Let $f : M_n \to M_k$ be a linear transformation. $f$ is said to be
positive if $f(A) \ge 0$ whenever $A \ge 0$; $f$ is strictly positive if $f(A) > 0$
when $A > 0$; and $f$ is unital if $f(I_n) = I_k$. Here are some examples:


1. $f : A \mapsto \operatorname{tr} A$ is strictly positive from $M_n$ to $M_1 = \mathbb{C}$.

2. $g : A \mapsto X^*AX$ is positive, where $X$ is a fixed $n \times k$ matrix.

3. $h : A \mapsto A_k$ is positive and unital, where $A_k$ is the $k \times k$ leading
principal submatrix of $A$.

4. $p : A \mapsto A \otimes X$ and $q : A \mapsto A \circ Y$ are both positive, where $X$
and $Y$ are positive semidefinite matrices.


Theorem 7.31 Let $f$ be strictly positive and unital and $A$ be a
positive definite matrix. Let $\alpha$ and $\beta$ be positive numbers, $\alpha < \beta$, such
that all the eigenvalues of $A$ are contained in the interval $[\alpha, \beta]$. Then

$$f(A^{-1}) \le \frac{(\alpha+\beta)^2}{4\alpha\beta}\,(f(A))^{-1}.$$

Proof. Since all the eigenvalues of $A$ are contained in $[\alpha, \beta]$, the
matrices $A - \alpha I$ and $\beta I - A$ are both positive semidefinite. As they
commute, $(A - \alpha I)(\beta I - A) \ge 0$. This implies

$$\alpha\beta I \le (\alpha+\beta)A - A^2 \quad\text{or}\quad \alpha\beta A^{-1} \le (\alpha+\beta)I - A.$$

Applying $f$, we have

$$\alpha\beta f(A^{-1}) \le (\alpha+\beta)I - f(A).$$

For any real numbers $c$ and $x$, $(c - 2x)^2 \ge 0$. It follows that
$c - x \le \frac14 c^2x^{-1}$ for any real $c$ and positive number $x$. Thus

$$\alpha\beta f(A^{-1}) \le (\alpha+\beta)I - f(A) \le \tfrac14(\alpha+\beta)^2(f(A))^{-1}.$$
∎

If we take $f$ in the theorem to be $f(A) = x^*Ax$, where $x \in \mathbb{C}^n$ is
a unit vector, then we have the Kantorovich inequality.


Theorem 7.32 (Kantorovich) Let $A \in M_n$ be positive definite
and $\lambda_1$, $\lambda_n$ be its largest and smallest eigenvalues, respectively. Then

$$(x^*Ax)(x^*A^{-1}x) \le \frac{(\lambda_1+\lambda_n)^2}{4\lambda_1\lambda_n}, \qquad x^*x = 1. \tag{7.32}$$
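
A numerical scan makes the two-sided estimate $1 \le (x^*Ax)(x^*A^{-1}x) \le (\lambda_1+\lambda_n)^2/(4\lambda_1\lambda_n)$ concrete. The sketch below assumes the diagonal matrix $A = \operatorname{diag}(4, 1)$ and real unit vectors $x = (\cos\theta, \sin\theta)^T$ sampled on a one-degree grid; these choices and the function name are illustrative only.

```python
import math

# Kantorovich inequality for A = diag(4, 1): scan real unit vectors
# x = (cos t, sin t) and compare (x*Ax)(x*A^{-1}x) with the bound (7.32).

lam1, lamn = 4.0, 1.0                        # largest / smallest eigenvalue
bound = (lam1 + lamn) ** 2 / (4 * lam1 * lamn)

def kantorovich_product(theta):
    """Return (x*Ax)(x*A^{-1}x) for x = (cos theta, sin theta)."""
    c2, s2 = math.cos(theta) ** 2, math.sin(theta) ** 2
    return (lam1 * c2 + lamn * s2) * (c2 / lam1 + s2 / lamn)

values = [kantorovich_product(2 * math.pi * k / 360) for k in range(360)]
print(min(values), max(values), bound)
```

For this matrix the upper bound is attained at $\theta = \pi/4$, where $x$ weights the two extreme eigenvectors equally, and the lower bound 1 is attained at the eigenvectors themselves.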

The Kantorovich inequality has made appearances in a variety
of forms. A matrix version is as follows. Let $A \in M_n$ be a positive
definite matrix. Then for any $n \times m$ matrix $X$ satisfying $X^*X = I_m$,

$$(X^*AX)^{-1} \le X^*A^{-1}X \le \frac{(\lambda_1+\lambda_n)^2}{4\lambda_1\lambda_n}(X^*AX)^{-1}.$$

The first inequality is proven by noting that $I - Y(Y^*Y)^{-1}Y^* \ge 0$
for any matrix $Y$ with columns linearly independent. For the
inequalities on the Hadamard product of positive definite matrices, we have

$$(A \circ B)^{-1} \le A^{-1} \circ B^{-1} \le \frac{(\lambda+\mu)^2}{4\lambda\mu}(A \circ B)^{-1},$$

where $\lambda$ is the largest and $\mu$ is the smallest eigenvalue of $A \otimes B$.
We leave the proofs to the reader (Problems 18 and 19).


Problems 


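1. Let $r \ge 1$. Show that for any positive number $t$ such that $\frac1r \le t \le r$,

$$t + \frac1t \le r + \frac1r.$$

2. Let $0 < a < b$. Show that for any $x \in [a, b]$,

$$\frac1x \le \frac{a+b}{ab} - \frac{x}{ab}.$$

3. Let $x, y \in \mathbb{C}^n$ be unit vectors. Show that the eigenvalues of the
matrix $\begin{pmatrix} (x,x) & (x,y) \\ (y,x) & (y,y) \end{pmatrix}$ are contained in $[1 - t, 1 + t]$, where $t = |(x, y)|$.


4. Let $A = \begin{pmatrix} a & b \\ \bar b & c \end{pmatrix}$ be a nonzero $2 \times 2$ Hermitian matrix with (necessarily
real) eigenvalues $\alpha$ and $\beta$, $\alpha \ge \beta$. Show that $2|b| \le \alpha - \beta$.


5. Let $A$ be an $n \times n$ positive definite matrix. Show that

$$|y^*x|^2 = (x^*Ax)(y^*A^{-1}y)$$

if and only if $y = 0$ or $Ax = cy$ for some constant $c$.


6. Let $\lambda_1, \dots, \lambda_n$ be positive numbers. Show that for any $x, y \in \mathbb{C}^n$,

$$\Bigl|\sum_{i=1}^n x_iy_i\Bigr|^2 \le \Bigl(\sum_{i=1}^n \lambda_i|x_i|^2\Bigr)\Bigl(\sum_{i=1}^n \lambda_i^{-1}|y_i|^2\Bigr).$$


7. Show that for positive numbers $a_1, a_2, \dots, a_n$, and any $t \in [0, 1]$,

$$\Bigl(\sum_{i=1}^n a_i^{1/2}\Bigr)^2 \le \Bigl(\sum_{i=1}^n a_i^t\Bigr)\Bigl(\sum_{i=1}^n a_i^{1-t}\Bigr).$$

Equality occurs if and only if $t = \frac12$ or all the $a_i$ are equal.


8. Let $A \in M_n$ be positive semidefinite. Show that for any unit $x \in \mathbb{C}^n$

$$(Ax, x)^2 \le (A^2x, x).$$


9. Let $A$, $B$, and $C$ be $n \times n$ matrices. Assume that $A$ and $C$ are positive
definite. Show that the following statements are equivalent.

(a) $\begin{pmatrix} A & B^* \\ B & C \end{pmatrix} \ge 0$.
(b) $\lambda_{\max}(BA^{-1}B^*C^{-1}) \le 1$.
(c) $|(Bx, y)| \le \frac12\bigl((Ax, x) + (Cy, y)\bigr)$ for all $x, y \in \mathbb{C}^n$.

10. Show that for any $n \times n$ matrix $A > 0$, $m \times n$ matrix $B$, $x \in \mathbb{C}^n$, $y \in \mathbb{C}^m$,

$$|(Bx, y)|^2 \le (Ax, x)(BA^{-1}B^*y, y)$$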


and that for any $m \times n$ matrices $A$, $B$, and $x \in \mathbb{C}^n$, $y \in \mathbb{C}^m$,

$$|((A+B)x, y)|^2 \le ((I+A^*A)x, x)\,((I+BB^*)y, y).$$

11. Let $A$ be an $n \times n$ positive definite matrix. Show that for all $x \ne 0$,

$$1 \le \frac{(x^*Ax)(x^*A^{-1}x)}{(x^*x)^2} \le \frac{\lambda_1}{\lambda_n}.$$


12. Let $A$ and $B$ be $n \times n$ Hermitian. If $A \le B$ or $B \le A$, show that

$$A \circ B \le \tfrac12(A \circ A + B \circ B).$$


13. Let $A$ and $B$ be $n \times n$ positive definite matrices with eigenvalues
contained in $[m, M]$, where $0 < m < M$. Show that for any $t \in [0, 1]$,

$$tA^2 + (1-t)B^2 - (tA + (1-t)B)^2 \le \tfrac14(M-m)^2I.$$


14. Show the Kantorovich inequality by the Wielandt inequality with
$y = \|x\|^2(A^{-1}x) - (x^*A^{-1}x)x$.

15. Show the Kantorovich inequality following the line:

(a) If $0 < m \le t \le M$, then $0 \le (m + M - t)t - mM$.

(b) If $0 < m \le \lambda_i \le M$, $i = 1, \dots, n$, and $\sum_{i=1}^n |x_i|^2 = 1$, then

$$S\Bigl(\frac1\lambda\Bigr) \le \frac{m + M - S(\lambda)}{mM},$$

where $S(\frac1\lambda) = \sum_{i=1}^n \frac{1}{\lambda_i}|x_i|^2$ and $S(\lambda) = \sum_{i=1}^n \lambda_i|x_i|^2$.

(c) The Kantorovich inequality follows from the inequality

$$S(\lambda)\,S\Bigl(\frac1\lambda\Bigr) \le \frac{(m+M)^2}{4mM}.$$

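16. Let $A$ be an $n \times n$ positive definite matrix having the largest
eigenvalue $\lambda_1$ and the smallest eigenvalue $\lambda_n$. Show that


max =

18. Let $A$ be an $n \times n$ positive definite matrix with $\lambda_1 = \lambda_{\max}(A)$ and
$\lambda_n = \lambda_{\min}(A)$. Show that for all $n \times m$ matrices $X$, $X^*X = I_m$,

$$(X^*AX)^{-1} \le X^*A^{-1}X \le \frac{(\lambda_1+\lambda_n)^2}{4\lambda_1\lambda_n}(X^*AX)^{-1}.$$

Use the first one to derive the inequality for principal submatrices:

$$[A]^{-1} \le [A^{-1}]$$

and then show the following matrix inequalities.

(a) $X^*AX - (X^*A^{-1}X)^{-1} \le (\sqrt{\lambda_1} - \sqrt{\lambda_n})^2 I$.
(b) $(X^*AX)^2 \le X^*A^2X \le \frac{(\lambda_1+\lambda_n)^2}{4\lambda_1\lambda_n}(X^*AX)^2$.
(c) $X^*A^2X - (X^*AX)^2 \le \frac14(\lambda_1-\lambda_n)^2 I$.
(d) $X^*AX \le (X^*A^2X)^{1/2} \le \frac{\lambda_1+\lambda_n}{2\sqrt{\lambda_1\lambda_n}}(X^*AX)$.
(e) $(X^*A^2X)^{1/2} - X^*AX \le \frac{(\lambda_1-\lambda_n)^2}{4(\lambda_1+\lambda_n)} I$.


19. Let $A$ and $B$ be $n \times n$ positive definite matrices. Show that

$$\lambda_{\max}(A \otimes B) = \lambda_{\max}(A)\lambda_{\max}(B) \quad (\text{denoted by } \lambda)$$

and

$$\lambda_{\min}(A \otimes B) = \lambda_{\min}(A)\lambda_{\min}(B) \quad (\text{denoted by } \mu).$$

Derive the following inequalities from the previous problem.

(a) $(A \circ B)^{-1} \le A^{-1} \circ B^{-1} \le \frac{(\lambda+\mu)^2}{4\lambda\mu}(A \circ B)^{-1}$.
(b) $A \circ B - (A^{-1} \circ B^{-1})^{-1} \le (\sqrt\lambda - \sqrt\mu)^2 I$.
(c) $(A \circ B)^2 \le A^2 \circ B^2 \le \frac{(\lambda+\mu)^2}{4\lambda\mu}(A \circ B)^2$.
(d) $A^2 \circ B^2 - (A \circ B)^2 \le \frac14(\lambda-\mu)^2 I$.
(e) $A \circ B \le (A^2 \circ B^2)^{1/2} \le \frac{\lambda+\mu}{2\sqrt{\lambda\mu}}\, A \circ B$.
(f) $(A^2 \circ B^2)^{1/2} - A \circ B \le \frac{(\lambda-\mu)^2}{4(\lambda+\mu)} I$.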

CHAPTER 8 


Hermitian Matrices 


Introduction: This chapter contains fundamental results of Hermi- 
tian matrices and demonstrates the basic techniques used to derive 
the results. Section 8.1 presents equivalent conditions to matrix Her- 
mitity, Section 8.2 gives some trace inequalities and discusses a nec- 
essary and sufficient condition for a square matrix to be a product of 
two Hermitian matrices, and Section 8.3 develops the min-max the- 
orem and the interlacing theorem for eigenvalues. Section 8.4 deals 
with the eigenvalue and singular value inequalities for the sum of Her- 
mitian matrices, and Section 8.5 shows a matrix triangle inequality. 


8.1 Hermitian Matrices and Their Inertias 


A square complex matrix A is said to be Hermitian if A is equal to 
its transpose conjugate, symbolically, A* = A. 


Theorem 8.1 An $n$-square complex matrix $A$ is Hermitian if and
only if there exists a unitary matrix $U$ such that

$$A = U^* \operatorname{diag}(\lambda_1, \dots, \lambda_n)\, U, \tag{8.1}$$

where the $\lambda_i$ are real numbers (and they are the eigenvalues of $A$).


F. Zhang, Matrix Theory: Basic Results and Techniques, Universitext, 253
In other words, A is Hermitian if and only if A is unitarily similar
to a real diagonal matrix. This is the Hermitian case of the spectral 
decomposition theorem (Theorem 3.4). The decomposition (8.1) is 
often used when a trace or norm inequality is under investigation. 


Theorem 8.2 The following statements for $A \in M_n$ are equivalent.


1. $A$ is Hermitian.

2. $x^*Ax \in \mathbb{R}$ for all $x \in \mathbb{C}^n$.

3. $A^2 = A^*A$.

4. $\operatorname{tr} A^2 = \operatorname{tr}(A^*A)$.


We show that (1)⇔(2) and (1)⇔(3). (1)⇔(4) is similar. It is not
difficult to see that (1) and (2) are equivalent, because a complex
number $a$ is real if and only if $a^* = a$ and (Problem 16)

$$A^* = A \iff x^*(A^* - A)x = 0 \ \text{ for all } x \in \mathbb{C}^n.$$


We present four different proofs for (3)⇒(1), each of which shows
a common technique of linear algebra and matrix theory. The first
proof gives (4)⇒(1) immediately. Other implications are obvious.


Proof 1. Use Schur decomposition. Write $A = U^*TU$, where $U$ is
unitary and $T$ is upper-triangular with the eigenvalues $\lambda_1, \dots, \lambda_n$ of
$A$ on the main diagonal. Then $A^2 = A^*A$ implies $T^2 = T^*T$.

By comparison of the main diagonal entries of the matrices on
both sides of $T^2 = T^*T$, we have, for each $j = 1, \dots, n$,

$$\lambda_j^2 = |\lambda_j|^2 + \sum_{i<j} |t_{ij}|^2.$$

It follows that each $\lambda_j$ is real and that $t_{ij} = 0$ whenever $i < j$.
Therefore, $T$ is real diagonal, and thus

$$A^* = (U^*TU)^* = U^*T^*U = U^*TU = A.$$

The trace identity $\operatorname{tr} T^2 = \operatorname{tr}(T^*T)$ yields (4)⇒(1) in the same way.


Proof 2. Use the fact that $\operatorname{tr}(XX^*) = 0 \iff X = 0$. We show that
$\operatorname{tr}(A - A^*)(A - A^*)^* = 0$ to conclude that $A - A^* = 0$ or $A = A^*$.
Upon computation we have

$$(A - A^*)(A - A^*)^* = AA^* - A^2 + A^*A - (A^*)^2,$$

which reveals, by using $\operatorname{tr}(AA^*) = \operatorname{tr}(A^*A)$ and $A^*A = A^2 = (A^*)^2$,

$$\operatorname{tr}(A - A^*)(A - A^*)^* = 0.$$


Proof 3. Use eigenvalues. Let $B = i(A - A^*)$. Then $B$ is Hermitian.
We show that $B$ has only zero eigenvalues; consequently, $B = 0$.
Suppose $\lambda$ is a nonzero eigenvalue of $B$ with eigenvector $x$:


$$Bx = \lambda x, \quad \lambda \ne 0, \quad x \ne 0.$$

Note that the condition $A^*A = A^2$ implies $BA = 0$. We have

$$\lambda (A^*x) = A^*(Bx) = (BA)^*x = 0.$$

Thus, $A^*x = 0$. But $Bx = \lambda x$ yields $A^*x = Ax + i\lambda x$. Therefore,

$$0 = x^*A^*x = x^*Ax + i\lambda x^*x = (x^*A^*x)^* + i\lambda x^*x = i\lambda x^*x.$$

It follows that $\lambda = 0$, a contradiction to the assumption $\lambda \ne 0$.
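Proof 4. Use inner product. Note that (Problem 16, Section 1.4)

$$\mathbb{C}^n = \operatorname{Ker} A^* \oplus \operatorname{Im} A.$$


Thus, to show $A^* = A$, it suffices to show $A^*x = Ax$ for every $x \in
\operatorname{Ker} A^*$ and $x \in \operatorname{Im} A$. If $x \in \operatorname{Ker} A^*$, then, by $A^*A = (A^*)^2$
(the conjugate transpose of $A^2 = A^*A$),


$$(Ax, Ax) = (A^*Ax, x) = ((A^*)^2 x, x) = 0.$$


This forces $Ax = 0$; namely, $Ax = A^*x$ for every $x \in \operatorname{Ker} A^*$. If
$x \in \operatorname{Im} A$, write $x = Ay$, $y \in \mathbb{C}^n$. We then have


$$A^*x = (A^*A)y = A^2y = A(Ay) = Ax. \qquad ∎$$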


Let $A$ be an $n \times n$ Hermitian matrix. The inertia of $A$ is defined
to be the ordered triple $(i_+(A), i_-(A), i_0(A))$, where $i_+(A)$, $i_-(A)$,
and $i_0(A)$ are the numbers of positive, negative, and zero eigenvalues
of $A$, respectively (including multiplicities). Denote the inertia of $A$
by $\operatorname{In}(A)$, i.e., $\operatorname{In}(A) = (i_+(A), i_-(A), i_0(A))$, or simply $(i_+, i_-, i_0)$ if
no confusion is caused. Obviously, $\operatorname{rank}(A) = i_+(A) + i_-(A)$.

The inertias of a nonsingular Hermitian matrix and its inverse are 
the same since their (necessarily nonzero) eigenvalues are reciprocals 
of each other. The inertias of similar Hermitian matrices are the same 
because their eigenvalues are identical. The inertias of *-congruent 
matrices are also the same; this is Sylvester’s law of inertia. We say 
that two $n \times n$ complex matrices $A$ and $B$ are *-congruent if there
exists a nonsingular $n \times n$ matrix $S$ such that $B = S^*AS$.


Theorem 8.3 (Sylvester’s Law of Inertia) Let A and B be Her- 
mitian matrices of the same size. Then A and B are *-congruent if 
and only if they have the same inertia; that is, In(A) = In(B). 


Proof. The spectral theorem ensures that there are positive diagonal
matrices $E$ and $F$ with respective sizes $i_+(A)$ and $i_-(A)$ such that
$A$ is unitarily similar (*-congruent) to $E \oplus (-F) \oplus 0_{i_0(A)}$. Setting
$G = E^{-1/2} \oplus F^{-1/2} \oplus I_{i_0(A)}$ and upon computation, we have


$$G^* \big(E \oplus (-F) \oplus 0_{i_0(A)}\big) G = I_{i_+(A)} \oplus (-I_{i_-(A)}) \oplus 0_{i_0(A)}.$$


A similar argument shows that $B$ is *-congruent to $I_{i_+(B)} \oplus (-I_{i_-(B)}) \oplus
0_{i_0(B)}$. If $\operatorname{In}(A) = \operatorname{In}(B)$, transitivity of *-congruence implies that $A$
and $B$ are *-congruent, i.e., $B = S^*AS$ for some nonsingular $S$.

For the converse, suppose that $B = S^*AS$ for some nonsingular
matrix $S$. Let $A = U^*MU$ and $B = VNV^*$, where $U$ and $V$ are
nonsingular matrices, $M = I_{i_+(A)} \oplus (-I_{i_-(A)}) \oplus 0_{i_0(A)}$ and $N =
I_{i_+(B)} \oplus (-I_{i_-(B)}) \oplus 0_{i_0(B)}$. Then $B = S^*AS$ implies that $N =
W^*MW$, where $W = US(V^*)^{-1}$ is a nonsingular matrix. Denote the first
$i_+(A)$ rows of $W$ by $W_1$ and the rest of the rows by $W_2$. Then $N =
W^*MW$ reveals $N = W_1^*W_1 - W_2^*(I_{i_-(A)} \oplus 0)W_2$. So $N \le W_1^*W_1$.
Since $N = I_{i_+(B)} \oplus (-I_{i_-(B)}) \oplus 0_{i_0(B)}$, the leading principal submatrix
of $W_1^*W_1$ corresponding to $I_{i_+(B)}$ is positive definite. Thus $i_+(B) \le
\operatorname{rank}(W_1^*W_1) = \operatorname{rank}(W_1) = i_+(A)$. It follows that $i_-(B) \ge i_-(A)$
as $A$ and $B$ have the same rank. On the other hand, applying the
above argument to $-A$ and $-B$ and noting that $i_+(-H) = i_-(H)$
for any Hermitian matrix $H$, we conclude that $\operatorname{In}(A) = \operatorname{In}(B)$. ∎


Below is a result on Schur complements of Hermitian matrices. 
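Theorem 8.4 Let $A$ be Hermitian, $A_{11}$ be a nonsingular principal
submatrix of $A$, and $\widetilde{A_{11}}$ be the Schur complement of $A_{11}$ in $A$. Then


$$\operatorname{In}(A) = \operatorname{In}(A_{11}) + \operatorname{In}(\widetilde{A_{11}}).$$


Proof. By permutation similarity (if necessary), we may assume that


$$A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix} \quad \text{and define} \quad G = \begin{pmatrix} I & -A_{11}^{-1}A_{12} \\ 0 & I \end{pmatrix}.$$

Then

$$G^*AG = \begin{pmatrix} A_{11} & 0 \\ 0 & \widetilde{A_{11}} \end{pmatrix}.$$


This says the eigenvalues of $G^*AG$ are those of $A_{11}$ and $\widetilde{A_{11}}$, or
$\operatorname{In}(G^*AG) = \operatorname{In}(A_{11}) + \operatorname{In}(\widetilde{A_{11}})$, where the triples are added as
vectors. The conclusion follows from Sylvester’s law of inertia. ∎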
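
The inertia formula of Theorem 8.4 can be turned into a small computation. The sketch below is only an illustration under stated assumptions: the helper names `inertia` and `congruence` are hypothetical, the matrices are real symmetric with rational entries, and every pivot met along the way is assumed nonzero (otherwise a permutation would be needed first). It peels off the (1,1) entry, replaces the matrix by the Schur complement, and counts signs; it then checks Sylvester's law on one congruent pair.

```python
from fractions import Fraction


def inertia(A):
    """Inertia (i_plus, i_minus, i_zero) of a real symmetric matrix.

    Hypothetical helper, not from the text. It repeatedly applies
    In(A) = In(a11) + In(Schur complement of a11) and assumes every
    pivot it meets is nonzero unless the remaining block is zero.
    """
    A = [[Fraction(x) for x in row] for row in A]
    pos = neg = zero = 0
    while A:
        a11 = A[0][0]
        if a11 == 0:
            if all(x == 0 for row in A for x in row):
                zero += len(A)  # a zero block has only zero eigenvalues
                break
            raise ValueError("zero pivot; permute rows and columns first")
        if a11 > 0:
            pos += 1
        else:
            neg += 1
        n = len(A)
        # Schur complement of the 1x1 block a11.
        A = [[A[i][j] - A[i][0] * A[0][j] / a11 for j in range(1, n)]
             for i in range(1, n)]
    return pos, neg, zero


def congruence(S, A):
    """Return S^T A S for real square matrices given as lists of rows."""
    n = len(A)
    AS = [[sum(A[i][k] * S[k][j] for k in range(n)) for j in range(n)]
          for i in range(n)]
    return [[sum(S[k][i] * AS[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


A = [[1, 2, 0], [2, 1, 0], [0, 0, 0]]
S = [[1, 2, 0], [0, 1, 3], [0, 0, 1]]  # nonsingular (unit upper triangular)
B = congruence(S, A)

print(inertia(A), inertia(B))
```

Exact rational arithmetic is used so that the sign of each pivot is not affected by rounding; the two printed triples agree, as Sylvester's law predicts for congruent matrices.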


Problems 


1. What are the differences in the spectral decompositions of normal, 
Hermitian, positive semidefinite, and unitary matrices? 


2. Show that the diagonal entries of a Hermitian matrix are all real. 
3. Show that A* + A and A*A are Hermitian for any square matrix A. 


4. Show that if A and B are Hermitian matrices of the same size, then
so are $A + B$ and $A - B$. What about $AB$ and $ABA$?


5. Let A be an n-square Hermitian matrix. Show that C*AC is also 
Hermitian for any n x m complex matrix C. 


6. Show that if A and B are Hermitian matrices of the same size, then
$AB = 0 \Leftrightarrow BA = 0$. What if A and B are not Hermitian?


7. Is a matrix similar to a Hermitian matrix necessarily Hermitian? 
What if unitary similarity is assumed? 


8. Show that if matrix A is skew-Hermitian, that is, A* = —A, then 
A = iB for some Hermitian matrix B. 


9. Let A be an n x n positive definite matrix and B be an n x n skew- 
Hermitian matrix. Show that the rank of B* AB is an even number. 
10. Let A be a nonsingular skew-Hermitian matrix. Show that $A^2 + A^{-1}$
is nonsingular and that $B = (A^2 - A^{-1})(A^2 + A^{-1})^{-1}$ is unitary.


11. Show that a square complex matrix A can be uniquely written as

$$A = B + iC = S + iT,$$

where B and C are Hermitian, and S and T are skew-Hermitian.

12. Show directly the implication (4)$\Rightarrow$(1) in Theorem 8.2.

13. If A is Hermitian, show that $A^2$ is positive semidefinite.


14. Find a unitary matrix U such that $U^*HU$ is diagonal, where


1 -i 
n=(1 5). 
15. Let A and B be Hermitian matrices of the same size. If $AB - BA$
and $A - B$ commute, show that A and B commute.


16. Let $A \in M_n$. Show that A is Hermitian if and only if

$$(Ax, y) = (x, Ay), \quad x, y \in \mathbb{C}^n,$$

and if and only if

$$(Ax, x) = (x, Ax), \quad x \in \mathbb{C}^n.$$

Is it true that A is Hermitian if

$$(Ax, x) = (x, Ax), \quad x \in \mathbb{R}^n?$$


17. Let A be an n-square complex matrix. Show that


(a) $\operatorname{tr}(AX) = 0$ for all Hermitian $X \in M_n$ if and only if $A = 0$.
(b) $\operatorname{tr}(AX) \in \mathbb{R}$ for all Hermitian $X \in M_n$ if and only if $A = A^*$.


(c) If A is Hermitian and $\operatorname{tr} A \ge \operatorname{Re} \operatorname{tr}(AU)$ for all unitary $U \in M_n$,
then $A \ge 0$.


18. Show that the rank of a Hermitian matrix is the same as the number
of nonzero eigenvalues of the matrix and that the rank of a general
matrix A equals the number of nonzero singular values of A, but not
the number of nonzero eigenvalues (in general).


19. Show that if A is a Hermitian matrix of rank r, then A has a nonsingular
principal submatrix of order r. How about a general matrix?
20. Let A be a Hermitian matrix with rank r. Show that all nonzero
$r \times r$ principal minors of A have the same sign.


21. Let $A = A^*$ and $\operatorname{tr} A = 0$. If the sum of all $2 \times 2$ principal minors of
A is zero, show that $A = 0$. [Hint: Use Problem 16 of Section 5.4.]


22. Show that the numbers of positive, negative, and zero eigenvalues in
Theorem 8.1 do not depend on the choice of unitary matrix U.


23. Find the rank and inertia of each of the following matrices:

$$\begin{pmatrix} 0 & 1 & 1 \\ 1 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix}, \qquad \begin{pmatrix} 1 & 0 & 1 \\ 0 & 2 & 0 \\ 1 & 0 & 1 \end{pmatrix}.$$


24. Let A be a Hermitian matrix and B be a principal submatrix of A.
Show that $i_\pm(B) \le i_\pm(A)$. Is it true that $i_0(B) \le i_0(A)$?


25. Let A and B be Hermitian matrices of the same size. Show that
if A < B then i-(A) < i+(B) and that if A < B and rank(A) = 
rank (B) then A and B are *-congruent. 


26. Let A be an $n \times n$ Hermitian matrix. Show that for any $n \times m$ matrix
Q with rank r, $i_-(Q^*AQ) \le r$ and $i_+(Q^*AQ) \le r$.


27. Let A and B be $n \times n$ nonsingular Hermitian matrices. If the smallest
eigenvalue $\lambda_{\min}(B^{-1}A) > 0$, show that $A \ge B \Leftrightarrow B^{-1} \ge A^{-1}$.


28. Compute the inertias of the following partitioned matrices in which
I is the $n \times n$ identity matrix and A and B are any $n \times n$ matrices:


IO Io. I+A*A I+A*B 
0 -I }’ I 0)? I+B*A I+B*B )- 


29. Show that for any $m \times n$ complex matrix A,


$$\operatorname{In}\begin{pmatrix} I_m & A \\ A^* & I_n \end{pmatrix} = (m, 0, 0) + \operatorname{In}(I_n - A^*A) = (n, 0, 0) + \operatorname{In}(I_m - AA^*).$$


30. Show that two unitary matrices are *-congruent if and only if they
are similar, i.e., if U and V are unitary, then $U = W^*VW$ for some
nonsingular W if and only if $U = R^{-1}VR$ for some nonsingular R.
8.2 The Product of Hermitian Matrices


This section concerns the product of two Hermitian matrices. As
is known, the product of two Hermitian matrices is not necessarily
Hermitian in general. For instance, take


$$A = \begin{pmatrix} 1 & 0 \\ 0 & -2 \end{pmatrix}, \qquad B = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}.$$

Note that the eigenvalues of $AB$ are the nonreal numbers $\frac{1}{2}(3 \pm \sqrt{7}\, i)$.
We first show a trace inequality of the product of two Hermitian
matrices, and then we turn our attention to discussing when a
matrix product is Hermitian. Note that the trace of a product of
two Hermitian matrices is always real although the product need not be
Hermitian. This is seen as follows. If A and B are Hermitian, then


$$\operatorname{tr}(AB) = \operatorname{tr}(A^*B^*) = \operatorname{tr}\big((BA)^*\big) = \overline{\operatorname{tr}(BA)} = \overline{\operatorname{tr}(AB)}.$$

That is, $\operatorname{tr}(AB)$ is real. Is this true for three Hermitian matrices?

Theorem 8.5 Let A and B be n-square Hermitian matrices. Then

$$\operatorname{tr}(AB)^2 \le \operatorname{tr}(A^2B^2). \tag{8.2}$$

Equality occurs if and only if A and B commute; namely, $AB = BA$.


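Proof 1. Let $C = AB - BA$. Using the fact that $\operatorname{tr}(XY) = \operatorname{tr}(YX)$
for any square matrices X and Y of the same size, we compute


$$\begin{aligned}
\operatorname{tr}(C^*C) &= \operatorname{tr}(BA - AB)(AB - BA) \\
&= \operatorname{tr}(BA^2B) + \operatorname{tr}(AB^2A) - \operatorname{tr}(BABA) - \operatorname{tr}(ABAB) \\
&= 2\operatorname{tr}(A^2B^2) - 2\operatorname{tr}(AB)^2.
\end{aligned}$$


Note that $\operatorname{tr}(A^2B^2) = \operatorname{tr}(AB^2A)$ is real because $AB^2A$ is Hermitian.
Thus $\operatorname{tr}(AB)^2$ is real. The inequality (8.2) then follows immediately
from the fact that $\operatorname{tr}(C^*C) \ge 0$ with equality if and only if $C = 0$;
that is, $AB = BA$.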
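
The identity behind Proof 1 is easy to check by hand on a small case. The following sketch is only an illustration: the helper names are hypothetical, and the two integer Hermitian matrices are a pair chosen here so that all arithmetic is exact. It computes $\operatorname{tr}(AB)^2$, $\operatorname{tr}(A^2B^2)$, and $\operatorname{tr}(C^*C)$ for $C = AB - BA$.

```python
def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def ctranspose(X):
    """Conjugate transpose of a square matrix."""
    n = len(X)
    return [[X[j][i].conjugate() for j in range(n)] for i in range(n)]


def trace(X):
    return sum(X[i][i] for i in range(len(X)))


# Two Hermitian (here real symmetric) matrices that do not commute.
A = [[1, 0], [0, -2]]
B = [[1, 1], [1, -1]]

AB, BA = matmul(A, B), matmul(B, A)
C = [[AB[i][j] - BA[i][j] for j in range(2)] for i in range(2)]

lhs = trace(matmul(AB, AB))                      # tr (AB)^2
rhs = trace(matmul(matmul(A, A), matmul(B, B)))  # tr (A^2 B^2)
gap = trace(matmul(ctranspose(C), C))            # tr (C^* C)

print(lhs, rhs, gap)
```

For this pair the gap $\operatorname{tr}(C^*C)$ equals twice the difference of the two traces and is strictly positive, consistent with the fact that $A$ and $B$ do not commute.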
Proof 2. Since for any unitary matrix $U \in M_n$, $U^*AU$ is also Hermitian,
inequality (8.2) holds if and only if


$$\operatorname{tr}\big((U^*AU)B\big)^2 \le \operatorname{tr}\big((U^*AU)^2B^2\big).$$

Thus we assume $A = \operatorname{diag}(a_1, \dots, a_n)$ by Schur decomposition. Then


$$\begin{aligned}
\operatorname{tr}(A^2B^2) - \operatorname{tr}(AB)^2 &= \sum_{i,j} a_i^2 |b_{ij}|^2 - \sum_{i,j} a_i a_j |b_{ij}|^2 \\
&= \sum_{i<j} (a_i - a_j)^2 |b_{ij}|^2 \ge 0
\end{aligned}$$

with equality if and only if $a_i b_{ij} = a_j b_{ij}$ for all $i, j$. This implies $AB = BA$.


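Proof 3 for the Equality Case. We use the fact that a matrix X is
Hermitian if and only if $\operatorname{tr} X^2 = \operatorname{tr}(XX^*)$ (see Theorem 8.2(4)).
Notice that the Hermitity of A and B gives


$$\operatorname{tr}(A^2B^2) = \operatorname{tr}(ABBA) = \operatorname{tr}(AB)(AB)^*.$$

Thus,

$$\operatorname{tr}(AB)^2 = \operatorname{tr}(A^2B^2) \ \Rightarrow\ \operatorname{tr}(AB)^2 = \operatorname{tr}(AB)(AB)^*.$$

It follows that $AB$ is Hermitian. Hence,


$$AB = (AB)^* = B^*A^* = BA. \qquad ∎$$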


Clearly, the product of two Hermitian matrices is Hermitian if 
and only if these two matrices commute (Problem 4). We are now 
interested in the following question. When can a given matrix be 
written as a product of two Hermitian matrices? 

Let A be given. Suppose A = BC is a product of two Hermitian 
matrices B and C of the same size. If B is nonsingular, then 


$$A = BC = B(CB)B^{-1} = BA^*B^{-1};$$


namely, A is similar to A*. This is in fact a necessary and sufficient 
condition for a matrix to be a product of two Hermitian matrices. 
Theorem 8.6 A square matrix A is a product of two Hermitian
matrices if and only if A is similar to $A^*$.


Proof. Necessity: Let $A = BC$, where B and C are n-square Hermitian
matrices. Then we have at once


$$AB = BCB = B(BC)^* = BA^*$$

and inductively for every positive integer k

$$A^kB = B(A^*)^k. \tag{8.3}$$

We may write, without loss of generality via similarity (Problem 7),

$$A = \begin{pmatrix} J & 0 \\ 0 & K \end{pmatrix},$$


where J and K contain the Jordan blocks of eigenvalues 0 and
nonzero, respectively. Note that J is nilpotent and K is invertible.
Partition B and C conformally with A as


$$B = \begin{pmatrix} L & M \\ M^* & N \end{pmatrix}, \qquad C = \begin{pmatrix} P & Q \\ Q^* & R \end{pmatrix}.$$


Then (8.3) implies that for each positive integer k

$$K^k M^* = M^* (J^*)^k.$$

Notice that $(J^*)^k = 0$ when $k \ge n$, for J is nilpotent. It follows that
$M = 0$, since K is nonsingular. Thus $A = BC$ is the same as


$$\begin{pmatrix} J & 0 \\ 0 & K \end{pmatrix} = \begin{pmatrix} L & 0 \\ 0 & N \end{pmatrix} \begin{pmatrix} P & Q \\ Q^* & R \end{pmatrix}.$$


This yields $K = NR$, and hence N and R are nonsingular.
Taking $k = 1$ in (8.3), we have


$$\begin{pmatrix} J & 0 \\ 0 & K \end{pmatrix} \begin{pmatrix} L & 0 \\ 0 & N \end{pmatrix} = \begin{pmatrix} L & 0 \\ 0 & N \end{pmatrix} \begin{pmatrix} J^* & 0 \\ 0 & K^* \end{pmatrix},$$

which gives $KN = NK^*$, or, because N is invertible,


$$N^{-1}KN = K^*.$$




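In other words, K is similar to $K^*$. On the other hand, any square
matrix is similar to its transpose (Theorem 3.14(1)). Thus J is similar
to $J^T = J^*$, and it follows that A is similar to $A^*$.

Sufficiency: We show that if A is similar to $A^*$, then A can be
expressed as a product of two Hermitian matrices. Notice that


$$A = P^{-1}BCP \ \Rightarrow\ A = \big(P^{-1}B(P^{-1})^*\big)\big(P^*CP\big).$$


This says if A is similar to a product of Hermitian matrices, then A
is in fact a product of Hermitian matrices.

Recall from Theorem 3.14(2) that A is similar to $A^*$ if and only if
the Jordan blocks of the nonreal eigenvalues $\lambda$ of A occur in conjugate
pairs. Thus it is sufficient to show that the paired Jordan block


$$\begin{pmatrix} J(\lambda) & 0 \\ 0 & J(\bar\lambda) \end{pmatrix},$$


where $J(\lambda)$ is a Jordan block with $\lambda$ on the diagonal, is similar to a
product of two Hermitian matrices. This is seen as follows: matrices


$$\begin{pmatrix} J(\lambda) & 0 \\ 0 & J(\bar\lambda) \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} J(\lambda) & 0 \\ 0 & (J(\bar\lambda))^T \end{pmatrix}$$


are similar, since any square matrix is similar to its transpose. But


$$\begin{pmatrix} J(\lambda) & 0 \\ 0 & (J(\bar\lambda))^T \end{pmatrix} = \begin{pmatrix} J(\lambda) & 0 \\ 0 & (J(\lambda))^* \end{pmatrix},$$


which is equal to a product of two Hermitian matrices:

$$\begin{pmatrix} 0 & J(\lambda) \\ (J(\lambda))^* & 0 \end{pmatrix} \begin{pmatrix} 0 & I \\ I & 0 \end{pmatrix}. \qquad ∎$$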


Problems 


1. Let A and B be Hermitian matrices of the same size. Show that 
AB — BA is skew-Hermitian and ABA — BAB is Hermitian. 


2. Let A, B, and C be n x n Hermitian matrices. Prove or disprove 
tr(ABC) = tr(BCA) or tr(ABC) = tr(CBA). 


Is tr(ABC) necessarily real? How about det(ABC)? Eigenvalues? 
3. Let A and B be $n \times n$ Hermitian matrices. Show that $\operatorname{tr}(A^kB^k)$ and
$\operatorname{tr}(AB)^k$ are real for any positive integer k.


4. Let A and B be n-square Hermitian matrices. Show that the product
AB is Hermitian if and only if $AB = BA$. What if A, B are normal?


5. Let A and B be Hermitian matrices of the same size. Show that AB
and BA are similar. What if A, B are normal?


6. If $\lambda_1, \lambda_2, \dots, \lambda_n$ are the eigenvalues of a Hermitian matrix A, what
are the singular values of A?


7. Give in detail the reason why the matrix A may be assumed to be a
Jordan form in the proof of Theorem 8.6.


8. Let $A \in M_n$. If $A^k = A^{k+1}$ for some positive integer k, show that

$$\operatorname{tr} A = \operatorname{tr} A^2 = \cdots = \operatorname{tr} A^n.$$


9. Show that for any square complex matrices A and B of the same size

$$\operatorname{tr}(AB - BA) = 0 \quad \text{and} \quad \operatorname{tr}(AB - BA)(AB + BA) = 0.$$
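10. Let A and B be Hermitian matrices of the same size. Show that


$$|\operatorname{tr}(AB)| \le (\operatorname{tr} A^2)^{1/2} (\operatorname{tr} B^2)^{1/2} \le \operatorname{tr}\left(\frac{A^2 + B^2}{2}\right)$$


and $(\operatorname{tr}(A + B)^2)^{1/2} \le (\operatorname{tr} A^2)^{1/2} + (\operatorname{tr} B^2)^{1/2}$.

11. Let A, B, and C be Hermitian matrices of the same size. Show that

$$|\operatorname{tr}(ABC)| \le |\operatorname{tr}(A^2B^2C^2)|^{1/2}$$

is not true in general. [Hint: Assume that A is a diagonal matrix.]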


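12. Let A be a square matrix with all eigenvalues real (A is not necessarily
Hermitian), k of which are nonzero, $k \ge 1$. Show that


13. Let $A, B \in M_n$ be Hermitian matrices of positive traces. Show that


$$\frac{\operatorname{tr}(A + B)^2}{\operatorname{tr}(A + B)} \le \frac{\operatorname{tr} A^2}{\operatorname{tr} A} + \frac{\operatorname{tr} B^2}{\operatorname{tr} B}.$$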
14. Consider two or three $n \times n$ Hermitian matrices. Prove or disprove


(a) The determinant of a product of Hermitian matrices is real. 

(b) The trace of a product of Hermitian matrices is real. 

(c) The eigenvalues of a product of Hermitian matrices are real. 
15. Let A and B be Hermitian matrices of the same size. Show that
there exists a unitary matrix U such that $U^*AU$ and $U^*BU$ are both
diagonal if and only if $AB = BA$.


16. Let A and B be Hermitian matrices of the same size. If $AB = BA$,
show that for any $a, b \in \mathbb{C}$ the eigenvalues of $aA + bB$ are in the form
$a\lambda + b\mu$, where $\lambda$ and $\mu$ are some eigenvalues of A and B, respectively.


17. Let $A \in M_n$ and S be an invertible matrix so that $S^{-1}AS = A^*$. Set

$$H_c = cS + \bar{c}S^*.$$

Show that $H_c$ and $AH_c$ are Hermitian matrices. Also show that

$$A = (AH_c)H_c^{-1}$$

is a product of two Hermitian matrices for some c such that $H_c$ is
invertible. Why does such an invertible matrix $H_c$ exist?


18. Show that a matrix is diagonalizable (not necessarily unitarily
diagonalizable) with real eigenvalues if and only if it can be written as
a product of a positive definite matrix and a Hermitian matrix. For
the singular case, is the product of a singular positive semidefinite
matrix and a Hermitian matrix always diagonalizable?


19. Show that any square matrix is a product of two symmetric matrices.


20. Let $A \in M_n$ be positive definite and let $B \in M_n$ be Hermitian such
that AB is a Hermitian matrix. Show that AB is positive definite if
and only if the eigenvalues of B are all positive.


21. Let $A \in M_n$ be a Hermitian matrix. Show that $\operatorname{tr} A > 0$ if and only
if $A = B + B^*$ for some B similar to a positive definite matrix.


22. Show that a matrix A is a product of two positive semidefinite
matrices if and only if A is similar to a positive semidefinite matrix.
8.3 The Min-Max Theorem and Interlacing Theorem


In this section we use some techniques on vector spaces to derive 
eigenvalue inequalities for Hermitian matrices. The idea is to choose 
vectors in certain subspaces spanned by eigenvectors in order to ob- 
tain the min-max representations that yield the desired inequalities. 

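Let H be an $n \times n$ Hermitian matrix with (necessarily real) eigenvalues
$\lambda_i(H)$, or simply $\lambda_i$, $i = 1, 2, \dots, n$. By Theorem 8.1, there is
a unitary matrix U such that


$$U^*HU = \operatorname{diag}(\lambda_1, \lambda_2, \dots, \lambda_n)$$


or


$$HU = U \operatorname{diag}(\lambda_1, \lambda_2, \dots, \lambda_n).$$


The column vectors $u_1, u_2, \dots, u_n$ of U are orthonormal eigenvectors
of H corresponding to $\lambda_1, \lambda_2, \dots, \lambda_n$, respectively, that is,


$$Hu_i = \lambda_i u_i, \quad u_i^*u_j = \delta_{ij}, \quad i, j = 1, 2, \dots, n, \tag{8.4}$$


where $\delta_{ij} = 1$ if $i = j$ and 0 otherwise (Kronecker delta).
We assume that the eigenvalues and singular values of a Hermitian
matrix H are arranged in decreasing order:


$$\lambda_{\max} = \lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n = \lambda_{\min},$$


$$\sigma_{\max} = \sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_n = \sigma_{\min}.$$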


The following theorem is of fundamental importance to the rest 
of this chapter. The idea and result are employed frequently. 


Theorem 8.7 Let H be an $n \times n$ Hermitian matrix. Let $u_1, u_2, \dots,
u_n$ be orthonormal eigenvectors of H corresponding to the (not necessarily
different) eigenvalues $\lambda_1, \lambda_2, \dots, \lambda_n$ of H, respectively. Let
$W = \operatorname{Span}\{u_p, \dots, u_q\}$, $1 \le p \le q \le n$. Then for any unit $x \in W$


$$\lambda_q(H) \le x^*Hx \le \lambda_p(H). \tag{8.5}$$
Proof. Let $x = x_p u_p + \cdots + x_q u_q$. Then by using (8.4)


$$\begin{aligned}
x^*Hx &= x^*(x_p Hu_p + \cdots + x_q Hu_q) \\
&= x^*(\lambda_p x_p u_p + \cdots + \lambda_q x_q u_q) \\
&= \lambda_p x_p x^*u_p + \cdots + \lambda_q x_q x^*u_q \\
&= \lambda_p |x_p|^2 + \cdots + \lambda_q |x_q|^2.
\end{aligned}$$


The inequality follows since x is a unit vector: $\sum_{i=p}^{q} |x_i|^2 = 1$. ∎


Theorem 8.8 (Rayleigh–Ritz) Let $H \in M_n$ be Hermitian. Then


$$\lambda_{\min}(H) = \min_{x^*x = 1} x^*Hx$$

and

$$\lambda_{\max}(H) = \max_{x^*x = 1} x^*Hx.$$


Proof. The eigenvectors of H in (8.4) form an orthonormal basis for
$\mathbb{C}^n$. By (8.5), it is sufficient to observe that


$$\lambda_{\min}(H) = u_n^*Hu_n \quad \text{and} \quad \lambda_{\max}(H) = u_1^*Hu_1. \qquad ∎$$
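
The Rayleigh–Ritz bounds can be probed numerically in the $2 \times 2$ case, where the eigenvalues of a Hermitian matrix have a closed form. The sketch below is an illustration only: `eig2` and `rayleigh` are hypothetical helper names, the matrix `H` is an arbitrary choice, and random sampling merely illustrates the bounds rather than proving them.

```python
import math
import random


def eig2(H):
    """Eigenvalues (largest first) of a 2x2 Hermitian matrix, closed form."""
    a, d, b = H[0][0].real, H[1][1].real, H[0][1]
    mid = (a + d) / 2
    rad = math.sqrt(((a - d) / 2) ** 2 + abs(b) ** 2)
    return mid + rad, mid - rad


def rayleigh(H, x):
    """Rayleigh quotient x^* H x / x^* x for a nonzero x in C^2."""
    Hx = [H[0][0] * x[0] + H[0][1] * x[1],
          H[1][0] * x[0] + H[1][1] * x[1]]
    num = x[0].conjugate() * Hx[0] + x[1].conjugate() * Hx[1]
    den = abs(x[0]) ** 2 + abs(x[1]) ** 2
    return (num / den).real  # the quotient is real because H is Hermitian


H = [[2, 1 - 1j], [1 + 1j, -1]]
lam_max, lam_min = eig2(H)

random.seed(0)
quotients = []
for _ in range(1000):
    x = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(2)]
    quotients.append(rayleigh(H, x))

print(lam_min, min(quotients), max(quotients), lam_max)
```

Every sampled quotient lies between the two eigenvalues, and with enough samples the extremes approach them, which is the content of the theorem.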


Recall the dimension identity (Theorem 1.1 of Section 1.1): If $S_1$
and $S_2$ are subspaces of an n-dimensional vector space, then


$$\dim(S_1 \cap S_2) = \dim S_1 + \dim S_2 - \dim(S_1 + S_2).$$

It follows that $S_1 \cap S_2$ contains a nonzero vector if

$$\dim S_1 + \dim S_2 > n \tag{8.6}$$

and that for three subspaces $S_1$, $S_2$, and $S_3$,

$$\dim(S_1 \cap S_2 \cap S_3) \ge \dim S_1 + \dim S_2 + \dim S_3 - 2n. \tag{8.7}$$


We use these inequalities to obtain the min-max theorem and 
derive eigenvalue inequalities for Hermitian matrices. 
Theorem 8.9 (Courant–Fischer) Let $H \in M_n$ be Hermitian and
let S represent a subspace of $\mathbb{C}^n$ (of complex column vectors). Then


$$\begin{aligned}
\lambda_k(H) &= \max_{\dim S = k} \ \min_{x \in S,\ x^*x = 1} x^*Hx \\
&= \max_{\dim S = n-k} \ \min_{x \in S^\perp,\ x^*x = 1} x^*Hx \\
&= \min_{\dim S = n-k+1} \ \max_{x \in S,\ x^*x = 1} x^*Hx \\
&= \min_{\dim S = k-1} \ \max_{x \in S^\perp,\ x^*x = 1} x^*Hx.
\end{aligned}$$

Proof. We show the first max-min representation. The second one
follows from the first immediately as $\dim S^\perp = k$ if $\dim S = n - k$.
The rest of the min-max representations are proven similarly. Let $u_i$
be orthonormal eigenvectors belonging to $\lambda_i$, $i = 1, 2, \dots, n$. We set


$$S_1 = \operatorname{Span}\{u_k, \dots, u_n\}, \quad \dim S_1 = n - k + 1,$$


and let $S_2 = S$ be any k-dimensional subspace of $\mathbb{C}^n$. By (8.6), there
exists a vector x such that $x \in S_1 \cap S_2$, $x^*x = 1$, and for this x, by
(8.5), $\lambda_k \ge x^*Hx$. Thus, for any k-dimensional subspace S of $\mathbb{C}^n$,


$$\lambda_k \ge \min_{v \in S,\ v^*v = 1} v^*Hv.$$


It follows that


$$\lambda_k \ge \max_{\dim S = k} \ \min_{v \in S,\ v^*v = 1} v^*Hv.$$


However, for any unit vector $v \in \operatorname{Span}\{u_1, u_2, \dots, u_k\}$, which has
dimension k, we have, by (8.5) again, $v^*Hv \ge \lambda_k$ and $u_k^*Hu_k = \lambda_k$.
Thus, for $S = \operatorname{Span}\{u_1, \dots, u_k\}$, we have


$$\min_{v \in S,\ v^*v = 1} v^*Hv \ge \lambda_k.$$


It follows that


$$\max_{\dim S = k} \ \min_{v \in S,\ v^*v = 1} v^*Hv \ge \lambda_k.$$

Putting these together,


$$\max_{\dim S = k} \ \min_{v \in S,\ v^*v = 1} v^*Hv = \lambda_k. \qquad ∎$$


The following theorem is usually referred to as the eigenvalue 
interlacing theorem, also known as the Cauchy, Poincaré, or Sturm 
interlacing theorem. It states, simply put, that the eigenvalues of 
a principal submatrix of a Hermitian matrix interlace those of the 
underlying matrix. This is used to obtain many matrix inequalities. 


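Theorem 8.10 (Eigenvalue Interlacing Theorem) Let H be an
$n \times n$ Hermitian matrix partitioned as


$$H = \begin{pmatrix} A & B \\ B^* & C \end{pmatrix},$$


where A is an $m \times m$ principal submatrix of H, $1 \le m \le n$. Then

$$\lambda_{k+n-m}(H) \le \lambda_k(A) \le \lambda_k(H), \quad k = 1, 2, \dots, m.$$

In particular, when $m = n - 1$,

$$\lambda_n(H) \le \lambda_{n-1}(A) \le \lambda_{n-1}(H) \le \cdots \le \lambda_2(H) \le \lambda_1(A) \le \lambda_1(H).$$


We present three different proofs in the following. For convenience,
we denote the eigenvalues of H and A, respectively, by


$$\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n, \qquad \mu_1 \ge \mu_2 \ge \cdots \ge \mu_m.$$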


Proof 1. Use subspaces spanned by certain eigenvectors. Let $u_i$
and $v_i$ be orthonormal eigenvectors of H and A belonging to the
eigenvalues $\lambda_i$ and $\mu_i$, respectively. Symbolically,


$$Hu_i = \lambda_i u_i, \quad u_i^*u_j = \delta_{ij}, \quad i, j = 1, 2, \dots, n, \quad u_i \in \mathbb{C}^n,$$

$$Av_i = \mu_i v_i, \quad v_i^*v_j = \delta_{ij}, \quad i, j = 1, 2, \dots, m, \quad v_i \in \mathbb{C}^m,$$


where $\delta_{ij} = 1$ if $i = j$ and 0 otherwise. Let


$$w_i = \begin{pmatrix} v_i \\ 0 \end{pmatrix} \in \mathbb{C}^n, \quad i = 1, 2, \dots, m.$$

Note that the $w_i$ are eigenvectors belonging to the respective eigenvalues
$\mu_i$ of the partitioned matrix $A \oplus 0$. For $1 \le k \le m$, we set


$$S_1 = \operatorname{Span}\{u_k, \dots, u_n\}$$

and

$$S_2 = \operatorname{Span}\{w_1, \dots, w_k\}.$$


Then

$$\dim S_1 = n - k + 1 \quad \text{and} \quad \dim S_2 = k.$$


We thus have a vector $x \in S_1 \cap S_2$, $x^*x = 1$, and for this x, by (8.5),

$$\lambda_k \ge x^*Hx \ge \mu_k. \tag{8.8}$$


An application of this inequality to $-H$ gives $\mu_k \ge \lambda_{k+n-m}$.


Proof 2. Use the adjoint matrix and continuity of functions. Reduce
the proof to the case $m = n - 1$ by considering a sequence of leading
principal submatrices, two consecutive ones differing in size by one.

We may assume that $\lambda_1 > \lambda_2 > \cdots > \lambda_n$. The case in which some
of the $\lambda_i$ are equal follows from a continuity argument (replacing $\lambda_i$
with $\lambda_i + \varepsilon_i$). Let U be an n-square unitary matrix such that


$$H = U^* \operatorname{diag}(\lambda_1, \lambda_2, \dots, \lambda_n)\, U.$$


Then

$$tI - H = U^* \operatorname{diag}(t - \lambda_1, t - \lambda_2, \dots, t - \lambda_n)\, U \tag{8.9}$$


and for $t \ne \lambda_i$, $i = 1, 2, \dots, n$,

$$\operatorname{adj}(tI - H) = \det(tI - H)\,(tI - H)^{-1}. \tag{8.10}$$

Upon computation, the $(n, n)$-entry of $(tI - H)^{-1}$ by using (8.9) is


$$\frac{|u_{1n}|^2}{t - \lambda_1} + \frac{|u_{2n}|^2}{t - \lambda_2} + \cdots + \frac{|u_{nn}|^2}{t - \lambda_n}$$


and the $(n, n)$-entry of $\operatorname{adj}(tI - H)$ is $\det(tI - A)$. Thus by (8.10)


$$\frac{\det(tI - A)}{\det(tI - H)} = \frac{|u_{1n}|^2}{t - \lambda_1} + \frac{|u_{2n}|^2}{t - \lambda_2} + \cdots + \frac{|u_{nn}|^2}{t - \lambda_n}. \tag{8.11}$$

Notice that the function of t defined in (8.11) is continuous except
at the points $\lambda_i$, and that it is decreasing on each interval $(\lambda_{i+1}, \lambda_i)$.
On the other hand, since $\mu_1, \mu_2, \dots, \mu_{n-1}$ are the roots of the numerator
$\det(tI - A)$, by considering the behavior of the function over the
intervals divided by the eigenvalues $\lambda_i$, it follows that


$$\mu_i \in [\lambda_{i+1}, \lambda_i], \quad i = 1, 2, \dots, n - 1.$$


The preceding proof is surely a good example of applications of 
calculus to linear algebra and matrix theory. 


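Proof 3. Use the Courant–Fischer theorem. Let $1 \le k \le m$. Then


$$\lambda_k(A) = \max_{S_m^k} \ \min_{x \in S_m^k,\ x^*x = 1} x^*Ax,$$


where $S_m^k$ is an arbitrary k-dimensional subspace of $\mathbb{C}^m$, and


$$\lambda_k(H) = \max_{S_n^k} \ \min_{x \in S_n^k,\ x^*x = 1} x^*Hx,$$


where $S_n^k$ is an arbitrary k-dimensional subspace of $\mathbb{C}^n$.
Denote by $S_0^k$ the k-dimensional subspace of $\mathbb{C}^n$ of the vectors


$$y = \begin{pmatrix} x \\ 0 \end{pmatrix}, \quad \text{where } x \in S_m^k.$$


Noticing that $y^*Hy = x^*Ax$, we have, by a simple computation,


$$\begin{aligned}
\lambda_k(H) &= \max_{S_n^k} \ \min_{x \in S_n^k,\ x^*x = 1} x^*Hx \\
&\ge \max_{S_0^k} \ \min_{y \in S_0^k,\ y^*y = 1} y^*Hy \\
&= \max_{S_m^k} \ \min_{x \in S_m^k,\ x^*x = 1} x^*Ax \\
&= \lambda_k(A).
\end{aligned}$$


The other inequality is obtained by replacing H with $-H$. ∎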
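
A small worked instance may help fix the indices in the interlacing inequalities. The matrix below is an illustrative choice, not one taken from the text; its eigenvalues are those of the standard $3 \times 3$ tridiagonal matrix with 2 on the diagonal and 1 off the diagonal.

```latex
% Illustrative example (matrix chosen here): n = 3, m = 2.
\[
H = \begin{pmatrix} 2 & 1 & 0 \\ 1 & 2 & 1 \\ 0 & 1 & 2 \end{pmatrix},
\qquad
A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}.
\]
% Eigenvalues in decreasing order:
\[
\lambda(H) = \bigl(2 + \sqrt{2},\; 2,\; 2 - \sqrt{2}\bigr),
\qquad
\lambda(A) = (3,\; 1).
\]
% Interlacing with m = n - 1:
\[
\lambda_3(H) = 2 - \sqrt{2} \;\le\; \lambda_2(A) = 1 \;\le\; \lambda_2(H) = 2
\;\le\; \lambda_1(A) = 3 \;\le\; \lambda_1(H) = 2 + \sqrt{2}.
\]
```

Numerically the chain reads roughly $0.586 \le 1 \le 2 \le 3 \le 3.414$.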


As an application of the interlacing theorem, we present a result
due to Poincaré: if $A \in M_m$ is a Hermitian matrix, then for any
$m \times n$ matrix V satisfying $V^*V = I_n$ and for each $i = 1, 2, \dots, n$,


$$\lambda_{i+m-n}(A) \le \lambda_i(V^*AV) \le \lambda_i(A). \tag{8.12}$$

To see this, first notice that $V^*V = I_n \Rightarrow m \ge n$ (Problem 1).
Let U be an $m \times (m - n)$ matrix such that $(V, U)$ is unitary. Then


$$(V, U)^*A(V, U) = \begin{pmatrix} V^*AV & V^*AU \\ U^*AV & U^*AU \end{pmatrix}.$$


Thus by applying the interlacing theorem, we have


$$\begin{aligned}
\lambda_{i+m-n}(A) &= \lambda_{i+m-n}\big((V, U)^*A(V, U)\big) \\
&\le \lambda_i(V^*AV) \\
&\le \lambda_i\big((V, U)^*A(V, U)\big) \\
&= \lambda_i(A).
\end{aligned}$$


Problems 


1. Let V be an $m \times n$ matrix. Show that if $V^*V = I_n$, then $m \ge n$.


2. Let $[A]$ be a principal submatrix of A. If A is Hermitian, show that


$$\lambda_{\min}(A) \le \lambda_{\min}([A]) \le \lambda_{\max}([A]) \le \lambda_{\max}(A).$$


3. Let $\lambda_k(A)$ denote the kth largest eigenvalue of an n-square positive
definite matrix A. Show that


$$\lambda_k(A^{-1}) = \frac{1}{\lambda_{n-k+1}(A)}.$$


4. Let $A \in M_n$ be Hermitian. Show that for every nonzero $x \in \mathbb{C}^n$,


$$\lambda_{\min}(A) \le \frac{x^*Ax}{x^*x} \le \lambda_{\max}(A)$$


and for all diagonal entries $a_{ii}$ of A,

$$\lambda_{\min}(A) \le a_{ii} \le \lambda_{\max}(A).$$

5. For any Hermitian matrices A and B of the same size, show that


$$\lambda_{\max}(A - B) + \lambda_{\min}(B) \le \lambda_{\max}(A).$$
6. Let $A \in M_n$ be Hermitian and $B \in M_n$ be positive definite. Show
that the eigenvalues of $AB^{-1}$ are all real, that


and that 


7. Let A be an $n \times n$ Hermitian matrix and X be an $n \times p$ matrix such
that $X^*X = I_p$. Then


i=n—pt+1 {=1 
and 
Pp n 
So Ap (A) < tr(X*AX) + < SO X;(A) 
i=l t=n—pt+1 


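8. Let A be an n-square positive semidefinite matrix. If V is an $n \times m$
complex matrix such that $V^*V = I_m$, show that


$$\prod_{i=1}^{m} \lambda_{n-m+i}(A) \le \det(V^*AV) \le \prod_{i=1}^{m} \lambda_i(A).$$


9. Let A be a positive semidefinite matrix partitioned as


$$A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix},$$

where $A_{11}$ is square, and let $\widetilde{A_{11}} = A_{22} - A_{21}A_{11}^{-1}A_{12}$ be the Schur
complement of $A_{11}$ in A when $A_{11}$ is nonsingular. Show that


$$\lambda_{\min}(A) \le \lambda_{\min}(\widetilde{A_{11}}) \le \lambda_{\min}(A_{22}).$$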
10. Let $H = \begin{pmatrix} X & Z \\ Z^* & Y \end{pmatrix}$ be Hermitian, where X is the $k \times k$ leading principal
submatrix of H. If $\lambda_i(H) = \lambda_i(X)$, $i = 1, 2, \dots, k$, show that $Z = 0$.


11. Let A be an n-square matrix and U be an $n \times k$ matrix, $k \le n$, such
that $U^*U = I_k$. Show that $\sigma_i(U^*AU) \le \sigma_i(A)$, $i = 1, 2, \dots, k$.


12. Show the min-max representations in Theorem 8.9.
8.4 Eigenvalue and Singular Value Inequalities


This section presents some basic eigenvalue and singular value in- 
equalities by using the min-max representation theorem and the 
eigenvalue interlacing theorem. We assume that the eigenvalues 
$\lambda_i(A)$, singular values $\sigma_i(A)$, and diagonal entries $d_i(A)$ of a Hermitian
matrix A are arranged in decreasing order.

The following theorem on comparing two Hermitian matrices best
characterizes the Löwner ordering in terms of eigenvalues.


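Theorem 8.11 Let $A, B \in M_n$ be Hermitian matrices. Then

$$A \ge B \ \Rightarrow\ \lambda_i(A) \ge \lambda_i(B), \quad i = 1, 2, \dots, n.$$

This follows from the Courant–Fischer theorem immediately, for

$$A \ge B \ \Rightarrow\ x^*Ax \ge x^*Bx, \quad x \in \mathbb{C}^n.$$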


Our next theorem compares the eigenvalues of sum, ordinary, and 
Hadamard products of matrices to those of the individual matrices. 


Theorem 8.12 Let $A, B \in M_n$ be Hermitian matrices. Then

1. $\lambda_i(A) + \lambda_n(B) \le \lambda_i(A + B) \le \lambda_i(A) + \lambda_1(B)$.

2. $\lambda_i(A)\lambda_n(B) \le \lambda_i(AB) \le \lambda_i(A)\lambda_1(B)$ if $A \ge 0$ and $B \ge 0$.

3. $d_i(A)\lambda_n(B) \le \lambda_i(A \circ B) \le d_i(A)\lambda_1(B)$ if $A \ge 0$ and $B \ge 0$.

Proof. Let x be a unit vector in $\mathbb{C}^n$. Then we have

$$x^*Ax + \min_x x^*Bx \le x^*(A + B)x \le x^*Ax + \max_x x^*Bx.$$


Thus

$$x^*Ax + \lambda_n(B) \le x^*(A + B)x \le x^*Ax + \lambda_1(B).$$


An application of the min-max theorem results in (1). For (2), we
write $\lambda_i(AB) = \lambda_i(B^{1/2}AB^{1/2})$. Notice that $\lambda_1(A)I - A \ge 0$. Thus


$$B^{1/2}AB^{1/2} \le B^{1/2}AB^{1/2} + B^{1/2}(\lambda_1(A)I - A)B^{1/2} = \lambda_1(A)B.$$

An application of Theorem 8.11 gives (2). For (3), recall that the
Hadamard product of two positive semidefinite matrices is positive
semidefinite (Schur theorem, Section 7.5). Since $\lambda_1(B)I - B \ge 0$ and
$B - \lambda_n(B)I \ge 0$, by taking the Hadamard product with A, we have


$$A \circ (\lambda_1(B)I - B) \ge 0, \quad A \circ (B - \lambda_n(B)I) \ge 0,$$

which reveals

$$\lambda_n(B)(I \circ A) \le A \circ B \le \lambda_1(B)(I \circ A).$$

Note that $I \circ A = \operatorname{diag}(a_{11}, \dots, a_{nn})$. Theorem 8.11 gives for each i,

$$d_i(A)\lambda_n(B) \le \lambda_i(A \circ B) \le d_i(A)\lambda_1(B). \qquad ∎$$
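
Part (1) of Theorem 8.12 can be checked on a concrete pair of $2 \times 2$ Hermitian matrices, for which eigenvalues are available in closed form. This is a sketch for illustration only: `eig2` is a hypothetical helper, and the matrices `A` and `B` are arbitrary choices made here.

```python
import math


def eig2(H):
    """Eigenvalues (largest first) of a 2x2 Hermitian matrix, closed form."""
    a, d, b = H[0][0].real, H[1][1].real, H[0][1]
    mid = (a + d) / 2
    rad = math.sqrt(((a - d) / 2) ** 2 + abs(b) ** 2)
    return [mid + rad, mid - rad]


A = [[2, 1j], [-1j, 0]]
B = [[1, 2], [2, -3]]
S = [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]  # A + B

lam_A, lam_B, lam_S = eig2(A), eig2(B), eig2(S)

# lambda_i(A) + lambda_n(B) <= lambda_i(A + B) <= lambda_i(A) + lambda_1(B)
tol = 1e-12
checks = [lam_A[i] + lam_B[1] - tol <= lam_S[i] <= lam_A[i] + lam_B[0] + tol
          for i in range(2)]

print(lam_A, lam_B, lam_S, checks)
```

Both index positions satisfy the two-sided bound for this pair; a small tolerance is included only to absorb floating-point rounding.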
It is natural to ask the question: if $A \ge 0$ and $B \ge 0$, is

$$\lambda_i(A)\lambda_n(B) \le \lambda_i(A \circ B) \le \lambda_i(A)\lambda_1(B)?$$

The answer is negative (Problem 8). For singular values, we have
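Theorem 8.13 Let A and B be complex matrices. Then

$$\sigma_i(A) - \sigma_1(B) \le \sigma_i(A + B) \le \sigma_i(A) + \sigma_1(B) \tag{8.13}$$

if A and B are of the same size $m \times n$, and

$$\sigma_i(A)\sigma_m(B) \le \sigma_i(AB) \le \sigma_i(A)\sigma_1(B) \tag{8.14}$$

if A is $m \times n$ and B is $n \times m$.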
Proof. To show (8.13), notice that the Hermitian matrix

$$\begin{pmatrix} 0 & X \\ X^* & 0 \end{pmatrix},$$


where X is an $m \times n$ matrix with rank r, has eigenvalues


$$\sigma_1(X), \dots, \sigma_r(X), \ 0, \dots, 0, \ -\sigma_r(X), \dots, -\sigma_1(X).$$

Applying the previous theorem to the Hermitian matrices


$$\begin{pmatrix} 0 & A \\ A^* & 0 \end{pmatrix}, \qquad \begin{pmatrix} 0 & B \\ B^* & 0 \end{pmatrix}$$


and to their sum, one gets the desired singular value inequalities.
For the inequalities on product, it suffices to note that


$$\sigma_i(AB) = \sqrt{\lambda_i(B^*A^*AB)} = \sqrt{\lambda_i(A^*ABB^*)}. \qquad ∎$$


Some stronger results can be obtained by using the min-max the- 
orem (Problem 18). For submatrices of a general matrix, we have 
the following inequalities on the singular values. 


Theorem 8.14 Let B be any submatrix of an $m \times n$ matrix A obtained
by deleting s rows and t columns, $s + t = r$. Then


$$\sigma_{r+i}(A) \le \sigma_i(B) \le \sigma_i(A), \quad i = 1, 2, \dots, \min\{m, n\}.$$


Proof. We may assume that r = 1 and B is obtained from A by 
deleting a column, say b; that is, A = (B,b). Otherwise one may 
place B in the upper-left corner of A by permutation and consider a 
sequence of submatrices of A that contain B, two consecutive ones 
differing by a row or column (see the second proof of Theorem 8.10). 
Notice that $B^*B$ is a principal submatrix of $A^*A$. Using the
eigenvalue interlacing theorem (Theorem 8.10), we have for each i,


$$\lambda_{i+1}(A^*A) \le \lambda_i(B^*B) \le \lambda_i(A^*A).$$


The proof is completed by taking square roots. ∎

Theorem 8.12 can be generalized to eigenvalues with indices r and
s, $r + s \le n - 1$. As an example, we present two such inequalities, one
for the sum of Hermitian matrices and one for the product of positive
semidefinite matrices, and leave others to the reader (Problem 18).


Theorem 8.15 Let A and B be n-square Hermitian matrices. If r
and s are nonnegative integers such that $r + s \le n - 1$, then


$$\lambda_{r+s+1}(A + B) \le \lambda_{r+1}(A) + \lambda_{s+1}(B).$$
Proof. Let $u_i$, $v_i$, and $w_i$, $i = 1, 2, \dots, n$, be orthonormal eigenvectors
corresponding to the eigenvalues $\lambda_i(A)$, $\lambda_i(B)$, and $\lambda_i(A + B)$, of the
matrices A, B, and $A + B$, respectively. Let


$$\begin{aligned}
S_1 &= \operatorname{Span}\{u_{r+1}, \dots, u_n\}, & \dim S_1 &= n - r, \\
S_2 &= \operatorname{Span}\{v_{s+1}, \dots, v_n\}, & \dim S_2 &= n - s, \\
S_3 &= \operatorname{Span}\{w_1, \dots, w_{r+s+1}\}, & \dim S_3 &= r + s + 1.
\end{aligned}$$


Then, by Problem 23 of Section 1.1,

$$\dim(S_1 \cap S_2 \cap S_3) \ge \dim S_1 + \dim S_2 + \dim S_3 - 2n = 1.$$

Thus, there exists a unit vector $x \in S_1 \cap S_2 \cap S_3$. By (8.5),

$$\lambda_{r+s+1}(A + B) \le x^*(A + B)x = x^*Ax + x^*Bx \le \lambda_{r+1}(A) + \lambda_{s+1}(B). \qquad ∎$$

Setting $r = 0$ in the theorem, we have for any $1 \le k \le n$,

$$\lambda_k(A + B) \le \lambda_1(A) + \lambda_k(B).$$

Applying the theorem to $-A$ and $-B$, one gets

$$\lambda_{n-r-s}(A + B) \ge \lambda_{n-r}(A) + \lambda_{n-s}(B).$$


Theorem 8.16 Let G and H be $n \times n$ positive semidefinite matrices.
If r and s are nonnegative integers such that $r + s \le n - 1$, then


$$\lambda_{r+s+1}(GH) \le \lambda_{r+1}(G)\,\lambda_{s+1}(H).$$


Proof. We may assume that both G and H are positive definite;
otherwise use a continuity argument. Let $u_i$ and $v_i$ be the orthonormal
eigenvectors corresponding to $\lambda_i(G)$ and $\lambda_i(H)$, respectively,
$i = 1, \dots, n$. Let $W_1$ be the subspace of $\mathbb{C}^n$ spanned by $u_1, \dots, u_r$
and $W_2$ be the subspace spanned by $v_1, \dots, v_s$.

Let $W_3 = \operatorname{Span}\{H^{1/2}u_1, \dots, H^{1/2}u_r\}$, so that $x \in W_3^\perp$ if and only if
$H^{1/2}x \in W_1^\perp$. Then $\dim(W_2 + W_3) \le r + s$. Let S be a subspace of
dimension $r + s$ containing $W_2 + W_3$.
Then $S^\perp \subseteq (W_2 + W_3)^\perp = W_2^\perp \cap W_3^\perp$. It follows that
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Arte+1 (HGH?) 


min max a* (H1?GHV?)z 
(dimW=r+s) (xeW+, «*x=1) 


max 2*(H/?GH/?)a 


xESt,x*x=1 


a*(H'V/?2GH'2)r 
max ; -a Ha 
rEeWsnWws, 2*r=1 xv*Ha 


a*(H'V/?2GH1/2)r : 
max 7 . max x He 
reWs; ,2*2=1 u* Aas ceWs, a*x=1 

- 

yGy 

max —_— - max xv He 

yews YY wceWdt,a*x=1 


drgi(G)Asui(H). 


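Problems

1. Use Theorem 8.11 to show that $A \ge B \ge 0$ implies

\[ \mathrm{rank}(A) \ge \mathrm{rank}(B), \qquad \det A \ge \det B. \]

2. Let $A \ge 0$, $B \ge 0$. Prove or disprove $\lambda_i(A) \ge \lambda_i(B) \Rightarrow A \ge B$.

3. Show that $A > B \Rightarrow \lambda_i(A) > \lambda_i(B)$ for each $i$.

4. Let $A \ge B \ge 0$. Show that for $X \ge 0$ of the same size as $A$ and $B$

\[ \lambda_i(AX) \ge \lambda_i(BX), \quad \text{for each } i. \]

5. Let $A \in M_n$ be a positive semidefinite matrix. Show that

\[ \lambda_1(A)I - A \ge 0 \ge \lambda_n(A)I - A, \quad \text{i.e.,} \quad \lambda_n(A)I \le A \le \lambda_1(A)I. \]

6. Let $A$ and $B$ be $n \times n$ positive semidefinite matrices. Show that

\[
\begin{aligned}
\lambda_n(A) + \lambda_n(B) &\le \lambda_n(A + B) \le \lambda_1(A + B) \le \lambda_1(A) + \lambda_1(B); \\
\lambda_n(A)\lambda_n(B) &\le \lambda_n(AB) \le \lambda_1(AB) \le \lambda_1(A)\lambda_1(B); \\
\lambda_n(A)\lambda_n(B) &\le \lambda_n(A \circ B) \le \lambda_1(A \circ B) \le \lambda_1(A)\lambda_1(B).
\end{aligned}
\]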




7. Show by example that for A > 0 and B > 0 of the same size the
following inequalities may both occur.

8. Show by example that for A > 0 and B > 0 of the same size the
following inequalities do not hold in general.

\[ \lambda_i(A)\lambda_n(B) \le \lambda_i(A \circ B) \le \lambda_i(A)\lambda_1(B). \]

9. Let $A, B \in M_n$ be Hermitian matrices. Prove or disprove

di(A +.B) < Ai(A) + G(B).

10. Let $A, B \in M_n$ be Hermitian matrices. Show that for any $a \in [0, 1]$,

\[ \lambda_{\min}(aA + (1 - a)B) \ge a\lambda_{\min}(A) + (1 - a)\lambda_{\min}(B) \]

and

\[ \lambda_{\max}(aA + (1 - a)B) \le a\lambda_{\max}(A) + (1 - a)\lambda_{\max}(B). \]

11. Let $A$ be Hermitian and $B \ge 0$ be of the same size. Show that

\[ \lambda_{\min}(B)\lambda_i(A^2) \le \lambda_i(ABA) \le \lambda_{\max}(B)\lambda_i(A^2). \]

12. Let $A \in M_n$ be Hermitian. Show that for any row vector $u \in \mathbb{C}^n$,

\[ \lambda_{i+2}(A + u^*u) \le \lambda_{i+1}(A) \le \lambda_i(A + u^*u) \]

and

\[ \lambda_{i+2}(A) \le \lambda_{i+1}(A + u^*u) \le \lambda_i(A). \]

13. When does $d(A) = \lambda(A)$? In other words, for what matrices are the
diagonal entries equal to the eigenvalues? How about

14. Let $A$ be an $n \times m$-compatible positive semidefinite matrix of size $n \times n$. If $V$ is an $n \times m$
matrix so that $V^*V = \mathrm{diag}(\delta_1, \delta_2, \dots, \delta_m)$, each $\delta_i > 0$, show that

\[ \lambda_n(A) \min_i \delta_i \le \lambda_i(V^*AV) \le \lambda_1(A) \max_i \delta_i. \]

15. Let $A$ and $B$ be $n$-square complex matrices. Show that

\[ \sigma_n(A)\sigma_n(B) \le |\lambda(AB)| \le \sigma_1(A)\sigma_1(B) \]

for any eigenvalue $\lambda(AB)$ of $AB$. In particular, $|\lambda(A)| \le \sigma_{\max}(A)$.

16. Let $A$, $X$, $B$ be $m \times p$, $p \times q$, $q \times n$ matrices, respectively. Show that

\[ \sigma_i(AXB) \le \sigma_1(A)\sigma_i(X)\sigma_1(B) \quad \text{for every } i \le \min\{m, p, q, n\}. \]

17. Let $X$ and $Y$ be $n \times n$ matrices, $t \in [0, 1]$, and $\tilde{t} = 1 - t$. Show that

\[ \sigma_i(tX + \tilde{t}Y) \le \sigma_i(X \oplus Y). \]

[Hint: Consider $(\alpha I, \beta I)(X \oplus Y)(\alpha I, \beta I)^T$ and use Problem 16.]

18. Let $A$ and $B$ be $n \times n$ matrices and $r + s \le n - 1$. Show the following.

(a) If $A$ and $B$ are Hermitian, then

\[ \lambda_{n-r-s}(A + B) \ge \lambda_{n-r}(A) + \lambda_{n-s}(B). \]

(b) If $A$ and $B$ are positive semidefinite, then

\[ \lambda_{n-r-s}(AB) \ge \lambda_{n-r}(A)\lambda_{n-s}(B). \]

(c) For singular values,

\[ \sigma_{r+s+1}(A + B) \le \sigma_{r+1}(A) + \sigma_{s+1}(B), \]

\[ \sigma_{r+s+1}(AB) \le \sigma_{r+1}(A)\sigma_{s+1}(B), \]

and

\[ \sigma_{n-r-s}(AB) \ge \sigma_{n-r}(A)\sigma_{n-s}(B); \]

but it is false that

\[ \sigma_{n-r-s}(A + B) \ge \sigma_{n-r}(A) + \sigma_{n-s}(B). \]


8.5 Eigenvalues of Hermitian Matrices A, B, and A + B


This section is devoted to the relationship between the eigenvalues 
of Hermitian matrices A and B and those of the sum A+ B. It has 
been evident that min-max representations play an important role in 
the study. We start with a simple, but neat result of min-max type. 


Theorem 8.17 (Fan) Let $H$ be an $n \times n$ Hermitian matrix. Denote
by $S_k$ a set of any $k$ orthonormal vectors $x_1, x_2, \dots, x_k \in \mathbb{C}^n$. Then

\[ \sum_{i=1}^k \lambda_i(H) = \max_{S_k} \sum_{i=1}^k x_i^* H x_i. \tag{8.15} \]

Proof. Let $U = (V, W)$ be a unitary matrix, where $V$ consists of the
orthonormal vectors $x_1, x_2, \dots, x_k$. Then by (8.12)

\[ \sum_{i=1}^k x_i^* H x_i = \mathrm{tr}(V^*HV) = \sum_{i=1}^k \lambda_i(V^*HV) \le \sum_{i=1}^k \lambda_i(H). \]

Identity (8.15) follows by choosing the unit eigenvectors $x_i$ of $\lambda_i(H)$:

\[ \sum_{i=1}^k x_i^* H x_i = \sum_{i=1}^k \lambda_i(H). \quad ∎ \]

Our main result of this section is the following theorem, which
results in a number of majorization inequalities (Chapter 10).


Theorem 8.18 (Thompson) Let $A$ and $B$ be $n \times n$ Hermitian matrices
and let $C = A + B$. If $\alpha_1 \ge \cdots \ge \alpha_n$, $\beta_1 \ge \cdots \ge \beta_n$, and
$\gamma_1 \ge \cdots \ge \gamma_n$ are the eigenvalues of $A$, $B$, and $C$, respectively, then
for any sequence $1 \le i_1 < \cdots < i_k \le n$,

\[ \sum_{t=1}^k \alpha_{i_t} + \sum_{t=1}^k \beta_{n-t+1} \le \sum_{t=1}^k \gamma_{i_t} \le \sum_{t=1}^k \alpha_{i_t} + \sum_{t=1}^k \beta_t. \tag{8.16} \]

This theorem is proved by using the following min-max expression
for the sum of eigenvalues that is in turn shown via a few lemmas.


Theorem 8.19 Let $H$ be an $n \times n$ Hermitian matrix and let $W_t$
represent a subspace of $\mathbb{C}^n$. Then, for $1 \le i_1 < \cdots < i_k \le n$,

\[ \sum_{t=1}^k \lambda_{i_t}(H) = \max_{\dim W_t = i_t}\ \min_{x_t \in W_t} \sum_{t=1}^k x_t^* H x_t, \tag{8.17} \]

where $x_1, \dots, x_k$ (in $W_1, \dots, W_k$, respectively) are orthonormal.

Lemma 8.1 Let $U_1, \dots, U_k$ be subspaces of an inner product space
such that $\dim U_t \ge t$, $t = 1, \dots, k$. If $u_2, \dots, u_k$ are linearly
independent vectors in $U_2, \dots, U_k$, respectively, then there exist linearly
independent vectors $y_1, y_2, \dots, y_k$ in $U_1, U_2, \dots, U_k$, respectively, such
that $\mathrm{Span}\{u_2, \dots, u_k\} \subseteq \mathrm{Span}\{y_1, y_2, \dots, y_k\}$. If, in addition,
$U_1 \subseteq \cdots \subseteq U_k$, then $y_1, y_2, \dots, y_k$ can be chosen to be orthonormal.


Proof. We use induction on $k$. Let $k = 2$ and $u_2 \in U_2$ be nonzero.
If $u_2 \in U_1$, then we set $y_1 = u_2$ and take $y_2$ from $U_2$ so that $y_2$ is not
a multiple of $u_2$ (i.e., $y_1$ and $y_2$ are linearly independent). This is
possible because $\dim U_2 \ge 2$. If $u_2 \notin U_1$, there must exist a nonzero
vector $y_1 \in U_1$ that is not a multiple of $u_2$. Now set $y_2 = u_2$. Then $y_1$ and
$y_2$ are linearly independent. In either case, $\mathrm{Span}\{u_2\} \subseteq \mathrm{Span}\{y_1, y_2\}$.

Now suppose it is true for the case of $k - 1$; that is, given
$u_2, \dots, u_{k-1}$, there exist linearly independent $y_1, y_2, \dots, y_{k-1}$ so that
the span of the $u$'s is contained in the span of the $y$'s. We show the
case of $k$. If $u_k \notin \mathrm{Span}\{y_1, y_2, \dots, y_{k-1}\}$, then we take $y_k = u_k$.
Otherwise, we take a nonzero $y_k \in U_k \cap (\mathrm{Span}\{y_1, y_2, \dots, y_{k-1}\})^\perp$.
In any case, we have $\mathrm{Span}\{u_2, \dots, u_k\} \subseteq \mathrm{Span}\{y_1, y_2, \dots, y_k\}$, where
$y_t \in U_t$ are linearly independent, $t = 1, \dots, k$. When $U_1 \subseteq \cdots \subseteq U_k$, due
to the Gram-Schmidt orthonormalization (Problem 21, Section 1.4),
$y_1, y_2, \dots, y_k$ can be chosen to be orthonormal. ∎

By putting $Q_t = U_{k-t+1}$ for each $t$ in the above lemma, we obtain
an equivalent statement to the lemma: if $Q_t$ are $k$ subspaces of
an inner product space, $\dim Q_t \ge k - t + 1$, and if $v_t$ are linearly
independent vectors, where $v_t \in Q_t$, $t = 1, \dots, k - 1$, there exist linearly
independent vectors $y_t$, where $y_t \in Q_t$, $t = 1, \dots, k$, such that
$\mathrm{Span}\{v_1, \dots, v_{k-1}\} \subseteq \mathrm{Span}\{y_1, y_2, \dots, y_k\}$.




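Lemma 8.2 Let $1 \le i_1 < \cdots < i_k \le n$. Let $W_1, \dots, W_k$ and
$V_1, \dots, V_k$ be subspaces of an $n$-dimensional inner product space such
that $\dim W_t \ge i_t$, $\dim V_t \ge n - i_t + 1$, $t = 1, \dots, k$, and
$V_1 \supseteq \cdots \supseteq V_k$. Then there exist orthonormal $x_t \in W_t$, $t = 1, \dots, k$,
and orthonormal $y_t \in V_t$, $t = 1, \dots, k$, such that

\[ \mathrm{Span}\{x_1, \dots, x_k\} = \mathrm{Span}\{y_1, \dots, y_k\}. \]

Proof. Since $\dim W_1 + \dim V_1 \ge n + 1$, by the dimension identity
(Theorem 1.1), $W_1 \cap V_1 \ne \{0\}$. A unit vector $x_1 = y_1 \in W_1 \cap V_1$ exists.
So the conclusion is true when $k = 1$. Suppose it is true for $k - 1$;
that is, there exist orthonormal $x_t \in W_t$ and $v_t \in V_t$, $t = 1, \dots, k - 1$,
such that $\mathrm{Span}\{x_1, \dots, x_{k-1}\} = \mathrm{Span}\{v_1, \dots, v_{k-1}\}$. Now let

\[ W = \mathrm{Span}\{x_1, \dots, x_{k-1}\} + W_k \cap (\mathrm{Span}\{x_1, \dots, x_{k-1}\})^\perp. \tag{8.18} \]

Then $\dim W \ge \dim W_k \ge i_k$ (Problem 3). Note that $i_k - i_t \ge k - t$;
we have $\dim W + \dim V_t \ge i_k + n - i_t + 1 \ge n + k - t + 1$.

Set $Q_t = W \cap V_t$, $t = 1, \dots, k$. By the dimension identity again,
$\dim Q_t \ge k - t + 1$. Moreover, $Q_1 \supseteq \cdots \supseteq Q_k$, and $v_t \in Q_t$,
$t = 1, \dots, k - 1$. By the discussion following Lemma 8.1, there exist
orthonormal $y_t \in Q_t$, $t = 1, \dots, k$, such that
$\mathrm{Span}\{x_1, \dots, x_{k-1}\} = \mathrm{Span}\{v_1, \dots, v_{k-1}\} \subseteq \mathrm{Span}\{y_1, \dots, y_k\} \subseteq W$.
Choose a unit vector $x_k$ from the space
$(\mathrm{Span}\{x_1, \dots, x_{k-1}\})^\perp \cap \mathrm{Span}\{y_1, \dots, y_k\} \ne \{0\}$. By
(8.18), $x_k \in W_k$ and $\mathrm{Span}\{x_1, \dots, x_k\} = \mathrm{Span}\{y_1, \dots, y_k\}$. ∎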


Lemma 8.3 Let $H$ be an $n \times n$ Hermitian matrix and let
$1 \le i_1 < \cdots < i_k \le n$. Then for any subspaces $W_1, \dots, W_k$ of $\mathbb{C}^n$ such that
$\dim W_t = i_t$, $t = 1, \dots, k$, there exist orthonormal $x_t \in W_t$,
$t = 1, \dots, k$, such that

\[ \sum_{t=1}^k x_t^* H x_t \le \sum_{t=1}^k \lambda_{i_t}(H). \]

Proof. Let $V_t = \mathrm{Span}\{u_{i_t}, u_{i_t+1}, \dots, u_n\}$, where $u_j$ are the
orthonormal eigenvectors of $H$ belonging to the eigenvalues $\lambda_j$,
respectively. Then $\dim V_t = n - i_t + 1$, $t = 1, \dots, k$, and $V_1 \supseteq \cdots \supseteq V_k$.
By Lemma 8.2, there exist orthonormal $x_1, \dots, x_k$ and orthonormal
$y_1, \dots, y_k$, where $x_t \in W_t$ and $y_t \in V_t$, $t = 1, \dots, k$, such
that $\mathrm{Span}\{x_1, \dots, x_k\} = \mathrm{Span}\{y_1, \dots, y_k\}$. By Problem 4 and
Theorem 8.7, we have

\[ \sum_{t=1}^k x_t^* H x_t = \sum_{t=1}^k y_t^* H y_t \le \sum_{t=1}^k \lambda_{i_t}(H). \quad ∎ \]


Now we are ready to prove Theorem 8.19. 


Proof of Theorem 8.19. Let $W_t = \mathrm{Span}\{u_1, u_2, \dots, u_{i_t}\}$, where $u_j$
are the orthonormal eigenvectors of $H$ belonging to the eigenvalues
$\lambda_j$, respectively. (This is possible because $H$ is Hermitian.) Then
$\dim W_t = i_t$, $t = 1, \dots, k$. For any unit vectors $x_t \in W_t$, we have

\[ \sum_{t=1}^k x_t^* H x_t \ge \sum_{t=1}^k \lambda_{i_t}(H). \]

Combining this with Lemma 8.3, we accomplish the proof. ∎


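Proof of Theorem 8.18. By Theorem 8.19, we may assume that
$\sum_{t=1}^k \gamma_{i_t}$ is attained by the subspaces $S_t$, $\dim S_t = i_t$, $t = 1, \dots, k$.
Then Lemma 8.3 ensures orthonormal $y_t \in S_t$, $t = 1, \dots, k$, with

\[ \sum_{t=1}^k y_t^* A y_t \le \sum_{t=1}^k \lambda_{i_t}(A). \]

However, Theorem 8.17 yields $\sum_{t=1}^k y_t^* B y_t \le \sum_{t=1}^k \lambda_t(B)$. Therefore,

\[ \sum_{t=1}^k \gamma_{i_t} \le \sum_{t=1}^k y_t^*(A + B)y_t \le \sum_{t=1}^k \lambda_{i_t}(A) + \sum_{t=1}^k \lambda_t(B). \]

This is the inequality on the right-hand side of (8.16). For the first
inequality, let $D = -B$ and denote the eigenvalues of $D$ by
$\delta_1 \ge \cdots \ge \delta_n$. Then $A = C + D$. The inequality we just proved reveals

\[ \sum_{t=1}^k \alpha_{i_t} \le \sum_{t=1}^k \gamma_{i_t} + \sum_{t=1}^k \delta_t. \]

Note that $\delta_i = -\beta_{n-i+1}$, $i = 1, \dots, n$. The inequality follows. ∎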
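To see the index bookkeeping in (8.16) at work, the following sketch checks both inequalities for every sequence $1 \le i_1 < \cdots < i_k \le n$ with $n = 3$. As a simplifying assumption it uses diagonal (hence commuting) matrices with hypothetical entries, so that the eigenvalues of $A$, $B$, and $C = A + B$ are just sorted diagonal entries; this illustrates the statement and is not a test of the general non-commuting case.

```python
from itertools import combinations

# Hypothetical diagonal entries of A and B (assumed data, not from the text).
diag_A = [5, 1, 3]
diag_B = [2, 7, -4]
diag_C = [a + b for a, b in zip(diag_A, diag_B)]  # C = A + B is diagonal too

# Eigenvalues in decreasing order.
alpha = sorted(diag_A, reverse=True)
beta = sorted(diag_B, reverse=True)
gamma = sorted(diag_C, reverse=True)
n = len(alpha)

results = []
for k in range(1, n + 1):
    for idx in combinations(range(n), k):   # 0-based version of i_1 < ... < i_k
        alpha_sum = sum(alpha[i] for i in idx)
        gamma_sum = sum(gamma[i] for i in idx)
        lower = alpha_sum + sum(beta[n - k:])   # adds the k smallest beta's
        upper = alpha_sum + sum(beta[:k])       # adds the k largest beta's
        results.append((idx, lower <= gamma_sum <= upper))

for idx, ok in results:
    print(idx, ok)
```

For $k = n$ both bounds coincide with $\mathrm{tr}\,A + \mathrm{tr}\,B = \mathrm{tr}\,C$, so (8.16) holds with equality there.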
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Problems 


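1. Let $A \in M_n$ be a Hermitian matrix with eigenvalues ordered
$|\lambda_1(A)| \ge \cdots \ge |\lambda_n(A)|$. Show that $|\lambda_1(A)| = \max_{x^*x=1} |x^*Ax|$. However,
$|\lambda_n(A)| = \min_{x^*x=1} |x^*Ax|$ is not true in general.

2. Let $A$ be an $m \times n$ matrix. For any index $i$, show that

\[
\begin{aligned}
\sigma_i(A) &= \max_{\dim W = i}\ \min_{x \in W,\ x^*x = 1} x^*(A^*A)^{1/2}x \\
&= \max_{\dim W = i}\ \min_{x \in W,\ x^*x = 1} (x^*A^*Ax)^{1/2}.
\end{aligned}
\]

3. Let $V$ and $S$ be subspaces of an inner product space of finite
dimension. Let $W = S + S^\perp \cap V$. Show that $\dim W \ge \dim V$.

4. Let $\{x_1, \dots, x_m\}$ and $\{y_1, \dots, y_m\}$ be orthonormal sets in $\mathbb{C}^n$ so that

\[ \mathrm{Span}\{x_1, \dots, x_m\} = \mathrm{Span}\{y_1, \dots, y_m\}. \]

Show that for any $n$-square complex matrix $A$,

\[ \sum_{t=1}^m x_t^* A x_t = \sum_{t=1}^m y_t^* A y_t. \]

5. Let $H$ be an $n \times n$ Hermitian matrix. For any $1 \le k \le n$, show that

\[ \sum_{i=1}^k \lambda_{n-i+1}(H) = \min_{U^*U = I_k} \mathrm{tr}\, U^*HU = \min_{x_i^*x_j = \delta_{ij}} \sum_{i=1}^k x_i^* H x_i, \]

where $\delta_{ij} = 1$ if $i = j$ and $0$ if $i \ne j$.

6. Let $H \in M_n$ be positive semidefinite. For any $1 \le k \le n$, show that

\[ \prod_{i=1}^k \lambda_i(H) = \max_{U^*U = I_k} \det U^*HU = \max_{x_i^*x_j = \delta_{ij}} \det(x_i^* H x_j), \]

where $\delta_{ij} = 1$ if $i = j$ and $0$ if $i \ne j$, and

\[ \prod_{i=1}^k \lambda_{n-i+1}(H) = \min_{U^*U = I_k} \det U^*HU = \min_{x_i^*x_j = \delta_{ij}} \det(x_i^* H x_j). \]

7. Let $A$ and $B$ be $n \times n$ Hermitian matrices and $B > 0$. Let
$\mu_1 \ge \mu_2 \ge \cdots \ge \mu_n$ be the eigenvalues of $\det(A - \mu B) = 0$. Show that

\[ \mu_1 = \max_{x \ne 0} \frac{x^*Ax}{x^*Bx}, \qquad \mu_n = \min_{x \ne 0} \frac{x^*Ax}{x^*Bx}. \]

Show more generally that for $1 \le k \le n$,

\[ \sum_{i=1}^k \mu_i = \max_{x_i^*Bx_j = 0,\ i \ne j} \sum_{i=1}^k \frac{x_i^*Ax_i}{x_i^*Bx_i} \]

and

\[ \sum_{i=n-k+1}^n \mu_i = \min_{x_i^*Bx_j = 0,\ i \ne j} \sum_{i=1}^k \frac{x_i^*Ax_i}{x_i^*Bx_i}. \]

8. Let $A$ be an $m \times n$ complex matrix. For any $1 \le k \le n$, show that

\[ \prod_{i=1}^k \sigma_i(A) = \max_{U^*U = I_k} \big(\det(U^*A^*AU)\big)^{1/2} \]

and

\[ \prod_{i=1}^k \sigma_{n-i+1}(A) = \min_{U^*U = I_k} \big(\det(U^*A^*AU)\big)^{1/2}. \]

9. Let $A$ be an $m \times n$ complex matrix. For any $1 \le k \le n$, show that

\[ \sum_{i=1}^k \sigma_i(A) = \max_{UU^* = V^*V = I_k} |\mathrm{tr}(UAV)| = \max_{UU^* = V^*V = I_k} \mathrm{Re}(\mathrm{tr}(UAV)). \]

However, neither of the following holds:

\[ \sum_{i=1}^k \sigma_{n-i+1}(A) = \min_{UU^* = V^*V = I_k} |\mathrm{tr}(UAV)|; \]

\[ \sum_{i=1}^k \sigma_{n-i+1}(A) = \min_{UU^* = V^*V = I_k} \mathrm{Re}(\mathrm{tr}(UAV)). \]

10. Let $A$ and $B$ be $n \times n$ Hermitian matrices such that
$\mathrm{tr}(A + B)^k = \mathrm{tr}(A^k) + \mathrm{tr}(B^k)$ for all positive integers $k$. Show that

(a) $\mathrm{rank}(A + B) = \mathrm{rank}\, A + \mathrm{rank}\, B$.

(b) $\mathrm{Im}(A + B) = \mathrm{Im}\, A + \mathrm{Im}\, B$.

(c) $AB = 0$.

[Hint: For (a), use Problem 7 of Section 5.4. For (c), view $A$ and $B$
as linear transformations on $\mathrm{Im}(A + B)$. Find the matrix of $A + B$
on $\mathrm{Im}(A + B)$ and use the equality case of the Hadamard inequality.]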




8.6 A Triangle Inequality for the Matrix $(A^*A)^{1/2}$


This section studies the positive semidefinite matrix $(A^*A)^{1/2}$,
denoted by $|A|$, where $A$ is any complex matrix. We call $|A|$ the modulus
of matrix $A$. The main result is that for any square matrices $A$ and
$B$ of the same size, there exist unitary matrices $U$ and $V$ such that

\[ |A + B| \le U^*|A|U + V^*|B|V. \]

As is known (Theorem 8.11), for Hermitian matrices $A$ and $B$,

\[ A \ge B \ \Rightarrow\ \lambda_i(A) \ge \lambda_i(B). \]

Our first observation is on the converse of the statement. The
inequalities $\lambda_i(A) \ge \lambda_i(B)$ for all $i$ cannot ensure $A \ge B$. For example,

\[ A = \begin{pmatrix} 3 & 0 \\ 0 & 2 \end{pmatrix}, \qquad B = \begin{pmatrix} 1 & 0 \\ 0 & 3 \end{pmatrix}. \]

Then $\lambda_1(A) = 3 \ge \lambda_1(B) = 3$ and $\lambda_2(A) = 2 \ge \lambda_2(B) = 1$. But
$A - B$, having a negative eigenvalue $-1$, is obviously not positive
semidefinite. We have, however, the following result.


Theorem 8.20 Let $A, B \in M_n$ be Hermitian matrices. If

\[ \lambda_i(A) \ge \lambda_i(B) \]

for all $i = 1, 2, \dots, n$, then there exists a unitary matrix $U$ such that

\[ U^*AU \ge B. \]

Proof. Let $P$ and $Q$ be unitary matrices such that

\[ A = P^* \mathrm{diag}(\lambda_1(A), \dots, \lambda_n(A)) P, \qquad B = Q^* \mathrm{diag}(\lambda_1(B), \dots, \lambda_n(B)) Q. \]

The condition $\lambda_i(A) \ge \lambda_i(B)$, $i = 1, \dots, n$, implies that

\[ \mathrm{diag}(\lambda_1(A), \dots, \lambda_n(A)) - \mathrm{diag}(\lambda_1(B), \dots, \lambda_n(B)) \ge 0. \]

Multiply both sides by $Q^*$ from the left and $Q$ from the right to get

\[ Q^* \mathrm{diag}(\lambda_1(A), \dots, \lambda_n(A)) Q - B \ge 0. \]

Take $U = P^*Q$. Then we have $U^*AU - B \ge 0$, as desired. ∎
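The construction $U = P^*Q$ in the proof can be traced on a small case. The sketch below uses two diagonal $2 \times 2$ matrices chosen for illustration (an assumption of this example): the eigenvalues of $A$ dominate those of $B$, yet $A - B$ is not positive semidefinite, while $U^*AU - B$ is. Because everything is real and diagonal here, the transpose stands in for the conjugate transpose and the semidefiniteness test only inspects diagonal entries.

```python
def matmul(x, y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(x):
    return [[x[j][i] for j in range(2)] for i in range(2)]

def sub(x, y):
    return [[x[i][j] - y[i][j] for j in range(2)] for i in range(2)]

def is_psd_diagonal(m):
    """Semidefiniteness test that is valid only for diagonal matrices."""
    assert m[0][1] == 0 and m[1][0] == 0
    return m[0][0] >= 0 and m[1][1] >= 0

A = [[3, 0], [0, 2]]   # eigenvalues 3 >= 2
B = [[1, 0], [0, 3]]   # eigenvalues 3 >= 1

P = [[1, 0], [0, 1]]   # A = P^T diag(3, 2) P
Q = [[0, 1], [1, 0]]   # B = Q^T diag(3, 1) Q (Q swaps the coordinates)

U = matmul(transpose(P), Q)
conjugated = matmul(matmul(transpose(U), A), U)   # U^T A U

print(is_psd_diagonal(sub(A, B)))            # A - B = diag(2, -1)
print(conjugated, is_psd_diagonal(sub(conjugated, B)))
```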


We now turn our attention to the positive semidefinite matrix $|A|$.
Note that $|A|$ is the unique positive semidefinite matrix satisfying

\[ |A|^2 = A^*A. \]

Moreover, for any complex matrix $A$,

\[ A = U|A| \tag{8.19} \]

is a polar decomposition of $A$, where $U$ is some unitary matrix.
Because of such a relation between $|A|$ and $A$, matrix $|A|$ has drawn
much attention, and many interesting results have been obtained.

Note that the eigenvalues of $|A|$ are the square roots of the
eigenvalues of $A^*A$; namely, the singular values of $A$. In symbols,

\[ \lambda(|A|) = (\lambda(A^*A))^{1/2} = \sigma(A). \]

In addition, if $A = UDV$ is a singular value decomposition of $A$, then

\[ |A| = V^*DV \quad \text{and} \quad |A^*| = UDU^*. \]


To prove Thompson’s matrix triangle inequality, two theorems 
are needed. They are of interest in their own right. 


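Theorem 8.21 Let $A$ be an $n$-square complex matrix. Then

\[ \lambda_i\Big(\frac{A^* + A}{2}\Big) \le \lambda_i(|A|), \quad i = 1, 2, \dots, n. \]

Proof. Take $v_1, v_2, \dots, v_n$ and $w_1, w_2, \dots, w_n$ to be orthonormal sets
of eigenvectors of $\frac{A^* + A}{2}$ and $A^*A$, respectively. Then for each $i$,

\[ \Big(\frac{A^* + A}{2}\Big)v_i = \lambda_i\Big(\frac{A^* + A}{2}\Big)v_i, \qquad (A^*A)w_i = (\lambda_i(|A|))^2 w_i. \]

For each fixed positive integer $k$ with $1 \le k \le n$, let

\[ S_1 = \mathrm{Span}\{v_1, v_2, \dots, v_k\}, \qquad S_2 = \mathrm{Span}\{w_k, \dots, w_n\}. \]

Then for some unit vector $x \in S_1 \cap S_2$, by Theorem 8.7,

\[ x^*\Big(\frac{A^* + A}{2}\Big)x \ge \lambda_k\Big(\frac{A^* + A}{2}\Big) \]

and

\[ x^*(A^*A)x \le (\lambda_k(|A|))^2. \]

By the Cauchy-Schwarz inequality, we have

\[
\begin{aligned}
\lambda_k\Big(\frac{A^* + A}{2}\Big) &\le x^*\Big(\frac{A^* + A}{2}\Big)x \\
&= \mathrm{Re}(x^*Ax) \\
&\le |x^*Ax| \\
&\le \sqrt{x^*A^*Ax} \\
&\le \lambda_k(|A|). \quad ∎
\end{aligned}
\]

Combining Theorems 8.20 and 8.21, we see that for any $n$-square
complex matrix $A$ there exists a unitary matrix $U$ such that

\[ \frac{A^* + A}{2} \le U^*|A|U. \tag{8.20} \]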


We are now ready to present the matrix triangle inequality. 


Theorem 8.22 (Thompson) For any square complex matrices $A$
and $B$ of the same size, unitary matrices $U$ and $V$ exist such that

\[ |A + B| \le U^*|A|U + V^*|B|V. \]

Proof. By the polar decomposition (8.19), we may write

\[ A + B = W|A + B|, \]

where $W$ is a unitary matrix. By (8.20), for some unitary $U$, $V$,

\[
\begin{aligned}
|A + B| &= W^*(A + B) \\
&= \tfrac{1}{2}\big(W^*(A + B) + (A + B)^*W\big) \\
&= \tfrac{1}{2}(A^*W + W^*A) + \tfrac{1}{2}(B^*W + W^*B) \\
&\le U^*|W^*A|U + V^*|W^*B|V \\
&= U^*|A|U + V^*|B|V. \quad ∎
\end{aligned}
\]


Note that Theorem 8.22 would be false without the presence of 
the unitary matrices U and V (Problem 16). 
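One way to see that the unitary matrices cannot be dropped is to compute a small case. The sketch below is an illustration under assumptions: the two real $2 \times 2$ matrices are hypothetical choices, and the helper `sqrt_psd2` uses the closed form $M^{1/2} = (M + sI)/t$ with $s = \sqrt{\det M}$ and $t = \sqrt{\mathrm{tr}\, M + 2s}$, which is valid for $2 \times 2$ positive semidefinite $M$ (it follows from the Cayley-Hamilton theorem applied to $M^{1/2}$).

```python
import math

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(x):
    return [[x[j][i] for j in range(2)] for i in range(2)]

def sqrt_psd2(m):
    """Square root of a real 2x2 positive semidefinite matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    s = math.sqrt(max(det, 0.0))
    t = math.sqrt(m[0][0] + m[1][1] + 2.0 * s)
    if t == 0.0:  # only happens for the zero matrix
        return [[0.0, 0.0], [0.0, 0.0]]
    return [[(m[i][j] + (s if i == j else 0.0)) / t for j in range(2)]
            for i in range(2)]

def modulus(a):
    """|A| = (A^T A)^{1/2} for a real 2x2 matrix A."""
    return sqrt_psd2(matmul(transpose(a), a))

def min_eig_sym2(m):
    """Smallest eigenvalue of a real symmetric 2x2 matrix."""
    mean = (m[0][0] + m[1][1]) / 2.0
    return mean - math.hypot((m[0][0] - m[1][1]) / 2.0, m[0][1])

A = [[1.0, 0.0], [0.0, 0.0]]
B = [[0.0, 1.0], [0.0, 0.0]]
S = [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

mod_A, mod_B, mod_S = modulus(A), modulus(B), modulus(S)
gap = [[mod_A[i][j] + mod_B[i][j] - mod_S[i][j] for j in range(2)]
       for i in range(2)]

min_gap_eig = min_eig_sym2(gap)              # negative: |A| + |B| - |A+B| is not PSD
trace_lhs = mod_S[0][0] + mod_S[1][1]        # tr |A + B|
trace_rhs = sum(mod_A[i][i] + mod_B[i][i] for i in range(2))
print(min_gap_eig, trace_lhs, trace_rhs)
```

Here $|A| + |B| = I$ while $|A + B|$ has an eigenvalue $\sqrt{2} > 1$, so the difference has the negative eigenvalue $1 - \sqrt{2}$; the traces still satisfy $\mathrm{tr}\,|A + B| \le \mathrm{tr}(|A| + |B|)$.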


Problems 


1. Show that $\frac{A^* + A}{2}$ is Hermitian for any $n \times n$ matrix $A$ and that

\[ \frac{A^* + A}{2} \ge 0 \iff \mathrm{Re}(x^*Ax) \ge 0 \ \text{for all } x \in \mathbb{C}^n. \]

2. Show that $A^*A \ge 0$ for any matrix $A$. What is the rank of $(A^*A)^{1/2}$?

3. Let $A$ be a square complex matrix and $|A| = (A^*A)^{1/2}$. Show that

(a) $A$ is positive semidefinite if and only if $|A| = A$.

(b) $A$ is normal if and only if $|A^*| = |A|$.

(c) $|A|$ and $|A^*|$ are similar.

(d) If $A = PU$ is a polar decomposition of $A$, where $P \ge 0$ and $U$
is unitary, then $|A| = U^*PU$ and $|A^*| = P$.


4. Find |A| and |A*| for each of the following matrices A. 


01 i 4 11 1 1 
OO) LO. F Na ale ed ot 
5. Find |A| and |A*| for A = (54), where a > 0. 
6. Let M = ({4)). Show that |M| = ('4!,2.)). 


7. Find $|A|$ for the normal matrix $A \in M_n$ with spectral decomposition

\[ A = U \mathrm{diag}(\lambda_1, \dots, \lambda_n) U^* = \sum_{i=1}^n \lambda_i u_i u_i^*, \quad \text{where } U = (u_1, \dots, u_n). \]

8. Let $A$ be an $m \times n$ complex matrix. Show that $|UAV| = V^*|A|V$ for
any unitary matrices $U \in M_m$ and $V \in M_n$.

9. Let $A \in M_n$. Then $A$ and $U^*AU$ have the same eigenvalues for any
unitary $U \in M_n$. Show that $A$ and $UAV$ have the same singular values
for any unitary $U, V \in M_n$. Do they have the same eigenvalues?

10. Let $A$ be a Hermitian matrix. If $X$ is a Hermitian matrix commuting
with $A$ such that $A \le X$ and $-A \le X$, show that $|A| \le X$.

11. Show that for any matrix $A$ there exist matrices $X$ and $Y$ such that

\[ A = (AA^*)X \quad \text{and} \quad A = (AA^*)^{1/2}Y. \]

12. Let $A$ be an $m \times n$ complex matrix, $m \ge n$. Show that there exists
an $m$-square unitary matrix $U$ such that

\[ AA^* = U^* \begin{pmatrix} A^*A & 0 \\ 0 & 0 \end{pmatrix} U. \]

13. Show that for any complex matrices $A$ and $B$ (of any sizes)

\[ |A \otimes B| = |A| \otimes |B|. \]

14. Show that for any unit column vector $x \in \mathbb{C}^n$ and $A \in M_n$,

\[ |x^*Ax|^2 \le x^*|A|^2 x. \]

15. Show that $|A + A^*| \le |A| + |A^*|$ for all normal matrices $A$.

16. Construct an example showing it is not true that for $A, B \in M_n$

\[ |A + B| \le |A| + |B|. \]

Show that the trace inequality, however, holds:

\[ \mathrm{tr}\,|A + B| \le \mathrm{tr}(|A| + |B|). \]

17. Let $A, B \in M_n$, $p, q > 1$, $1/p + 1/q = 1$. Show that

\[ |A - B|^2 + \big|\sqrt{p/q}\, A + \sqrt{q/p}\, B\big|^2 = p|A|^2 + q|B|^2. \]

18. Let $A, B \in M_n$, $a, b > 0$, $c \in \mathbb{R}$, $ab \ge c^2$. Show that

\[ a|A|^2 + b|B|^2 + c(A^*B + B^*A) \ge 0. \]

19. Let $A_1, \dots, A_k \in M_n$, $a_1, \dots, a_k \ge 0$, $a_1 + \cdots + a_k = 1$. Show that

\[ |a_1A_1 + \cdots + a_kA_k|^2 \le a_1|A_1|^2 + \cdots + a_k|A_k|^2. \]

20. Let $A, B, C, D \in M_n$. If $CC^* + DD^* \le I$, show that

\[ |CA + DB| \le (|A|^2 + |B|^2)^{1/2}. \]

21. Let $A$ and $B$ be $n$-square Hermitian matrices. Show that

(EP )o0 + (iM Joo 
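22. Let $A$ be an $n \times n$ Hermitian matrix with the largest eigenvalue
$\lambda_{\max}$ and the smallest eigenvalue $\lambda_{\min}$. Show that

\[ \lambda_{\max} - \lambda_{\min} = 2\max\{|x^*Ay| : x, y \in \mathbb{C}^n \text{ orthonormal}\}. \]

Derive that

\[ \lambda_{\max} - \lambda_{\min} \ge 2\max_{i \ne j} |a_{ij}|. \]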


23. Let $A$ be any $n \times n$ matrix. By Theorem 8.21, for $i = 1, 2, \dots, n$,

\[ \lambda_i\Big(\frac{A^* + A}{2}\Big) \le \sigma_i(A). \]


Does it follow that 


Show that Ane > (4£2) for all n x n Hermitian matrices A 

and B. With P = (aa) and Q = (Fs show that for any given 

positive integer k > 2, one may choose a sufficiently small positive 

PE+Q* | (PAO se 
2 2 


real number «x such that has a negative eigenvalue. 


CHAPTER 9 


Normal Matrices 


Introduction: A great deal of elegant work has been done for normal 
matrices. The goal of this chapter is to present basic results and 
methods on normal matrices. Section 9.1 gives conditions equivalent 
to the normality of matrices, Section 9.2 focuses on a special type of 
normal matrix with entries consisting of zeros and ones, Section 9.3 
studies the positive semidefinite matrix $(A^*A)^{1/2}$ associated with a
matrix A, and finally Section 9.4 compares two normal matrices. 


9.1 Equivalent Conditions 


A square complex matrix A is said to be normal if it commutes with 
its conjugate transpose; in symbols, 


\[ A^*A = AA^*. \]


Matrix normality is one of the most interesting topics in linear al- 
gebra and matrix theory, since normal matrices have not only simple 
structures under unitary similarity but also many applications. 

This section presents conditions equivalent to normality. 
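The defining identity can be tested directly on small matrices. The following sketch is a plain entrywise comparison of $A^*A$ and $AA^*$; the function names, the tolerance, and the three sample matrices are choices made for this illustration.

```python
def conj_transpose(a):
    """Conjugate transpose of a square matrix given as nested lists."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def is_normal(a, tol=1e-12):
    """True when A*A and AA* agree entrywise up to tol."""
    a_star = conj_transpose(a)
    left, right = matmul(a_star, a), matmul(a, a_star)
    n = len(a)
    return all(abs(left[i][j] - right[i][j]) <= tol
               for i in range(n) for j in range(n))

normal_not_hermitian = [[1, 1], [-1, 1]]   # A*A = AA* = 2I
hermitian = [[2, 1j], [-1j, 3]]            # Hermitian matrices are normal
jordan_block = [[1, 1], [0, 1]]            # A*A and AA* differ

for m in (normal_not_hermitian, hermitian, jordan_block):
    print(m, is_normal(m))
```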


Theorem 9.1 Let $A = (a_{ij})$ be an $n$-square complex matrix with
eigenvalues $\lambda_1, \lambda_2, \dots, \lambda_n$. The following statements are equivalent.

1. $A$ is normal; that is, $A^*A = AA^*$.

2. $A$ is unitarily diagonalizable; namely, there exists an $n$-square
unitary matrix $U$ such that

\[ U^*AU = \mathrm{diag}(\lambda_1, \lambda_2, \dots, \lambda_n). \tag{9.1} \]

3. There exists a polynomial $p(x)$ such that $A^* = p(A)$.

4. There exists a set of eigenvectors of $A$ that form an orthonormal
basis for $\mathbb{C}^n$.

5. Every eigenvector of $A$ is an eigenvector of $A^*$.

6. Every eigenvector of $A$ is an eigenvector of $A + A^*$.

7. Every eigenvector of $A$ is an eigenvector of $A - A^*$.

8. $A = B + iC$ for some $B$ and $C$ Hermitian, and $BC = CB$.

9. If $U$ is a unitary matrix such that $U^*AU = \begin{pmatrix} B & C \\ 0 & D \end{pmatrix}$, where $B$
and $D$ are square, then $B$ and $D$ are normal and $C = 0$.

10. If $W \subseteq \mathbb{C}^n$ is an invariant subspace of $A$, then so is $W^\perp$.

11. If $x$ is an eigenvector of $A$, then $x^\perp$ is invariant under $A$.

12. $A$ can be written as $A = \sum_{i=1}^n \lambda_i E_i$, where $\lambda_i \in \mathbb{C}$ and $E_i \in M_n$
satisfy $E_i^2 = E_i = E_i^*$, $E_iE_j = 0$ if $i \ne j$, and $\sum_{i=1}^n E_i = I$.

13. $\mathrm{tr}(A^*A) = \sum_{i=1}^n |\lambda_i|^2$.

14. The singular values of $A$ are $|\lambda_1|, |\lambda_2|, \dots, |\lambda_n|$.

15. $\sum_{i=1}^n (\mathrm{Re}\, \lambda_i)^2 = \frac{1}{4} \mathrm{tr}(A + A^*)^2$.

16. $\sum_{i=1}^n (\mathrm{Im}\, \lambda_i)^2 = -\frac{1}{4} \mathrm{tr}(A - A^*)^2$.

17. The eigenvalues of $A + A^*$ are $\lambda_1 + \bar\lambda_1, \dots, \lambda_n + \bar\lambda_n$.

18. The eigenvalues of $AA^*$ are $\lambda_1\bar\lambda_{\pi(1)}, \dots, \lambda_n\bar\lambda_{\pi(n)}$ for some
permutation $\pi$ on $\{1, 2, \dots, n\}$.

19. $\mathrm{tr}(A^*A)^2 = \mathrm{tr}\big((A^*)^2A^2\big)$.

20. $(A^*A)^2 = (A^*)^2A^2$.

21. $\|Ax\| = \|A^*x\|$ for all $x \in \mathbb{C}^n$.

22. $(Ax, Ay) = (A^*x, A^*y)$ for all $x, y \in \mathbb{C}^n$.

23. $|A| = |A^*|$, where $|A| = (A^*A)^{1/2}$.

24. $A^* = AU$ for some unitary $U$.

25. $A^* = VA$ for some unitary $V$.

26. $UP = PU$ if $A = UP$, a polar decomposition of $A$.

27. $AU = UA$ if $A = UP$, a polar decomposition of $A$.

28. $AP = PA$ if $A = UP$, a polar decomposition of $A$.

29. $A$ commutes with a normal matrix of no duplicate eigenvalues.

30. $A$ commutes with $A + A^*$.

31. $A$ commutes with $A - A^*$.

32. $A + A^*$ and $A - A^*$ commute.

33. $A$ commutes with $A^*A$.

34. $A$ commutes with $AA^* - A^*A$.

35. $A^*B = BA^*$ whenever $AB = BA$.

36. $A^*A - AA^*$ is a positive semidefinite matrix.

37. $|(Ax, x)| \le (|A|x, x)$ for all $x \in \mathbb{C}^n$, where $|A| = (A^*A)^{1/2}$.


Proof. (2)$\Leftrightarrow$(1): We show that (1) implies (2). The other direction
is obvious. Let $A = U^*TU$ be a Schur decomposition of $A$. It suffices
to show that the upper-triangular matrix $T = (t_{ij})$ is diagonal.

Note that $A^*A = AA^*$ yields $T^*T = TT^*$. Computing and equating
the (1,1)-entries of $T^*T$ and $TT^*$, we have
$|t_{11}|^2 = |t_{11}|^2 + \sum_{j=2}^n |t_{1j}|^2$. It follows that $t_{1j} = 0$ if $j > 1$. Inductively, we have
$t_{ij} = 0$ whenever $i < j$. Thus $T$ is diagonal.

(3)$\Leftrightarrow$(2): To show that (2) implies (3), we choose a polynomial
$p(x)$ of degree at most $n - 1$ (by interpolation) such that

\[ p(\lambda_i) = \bar\lambda_i, \quad i = 1, \dots, n. \]

Thus, if $A = U^* \mathrm{diag}(\lambda_1, \dots, \lambda_n) U$ for some unitary matrix $U$, then

\[
\begin{aligned}
A^* &= U^* \mathrm{diag}(\bar\lambda_1, \dots, \bar\lambda_n) U \\
&= U^* \mathrm{diag}(p(\lambda_1), \dots, p(\lambda_n)) U \\
&= U^* p(\mathrm{diag}(\lambda_1, \dots, \lambda_n)) U \\
&= p(U^* \mathrm{diag}(\lambda_1, \dots, \lambda_n) U) \\
&= p(A).
\end{aligned}
\]

For the other direction, if $A^* = p(A)$ for some polynomial $p$, then
$A^*A = p(A)A = Ap(A) = AA^*$.
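The interpolation step behind condition (3) can be made concrete. The sketch below evaluates the Lagrange interpolating polynomial with $p(\lambda_i) = \bar\lambda_i$ at the matrix itself. It assumes the eigenvalues are supplied and pairwise distinct (with repeated eigenvalues one would interpolate on the distinct values only); the sample matrix, its eigenvalues $1 \pm i$, and the function name `lagrange_at_matrix` are illustrative choices.

```python
def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def lagrange_at_matrix(a, nodes, values):
    """Evaluate at the matrix a the polynomial p with p(nodes[i]) = values[i].

    Uses p(A) = sum_i values[i] * prod_{j != i} (A - nodes[j] I) / (nodes[i] - nodes[j]).
    The nodes must be pairwise distinct.
    """
    n = len(a)
    total = [[0j] * n for _ in range(n)]
    for i, (x_i, v_i) in enumerate(zip(nodes, values)):
        term = [[v_i if r == c else 0j for c in range(n)] for r in range(n)]
        for j, x_j in enumerate(nodes):
            if j != i:
                factor = [[(a[r][c] - (x_j if r == c else 0)) / (x_i - x_j)
                           for c in range(n)] for r in range(n)]
                term = matmul(term, factor)
        total = [[total[r][c] + term[r][c] for c in range(n)] for r in range(n)]
    return total

A = [[1, 1], [-1, 1]]                 # a normal matrix
eigenvalues = [1 + 1j, 1 - 1j]        # roots of (x - 1)^2 + 1, assumed known
A_star = [[1, -1], [1, 1]]            # conjugate transpose of A

p_of_A = lagrange_at_matrix(A, eigenvalues, [z.conjugate() for z in eigenvalues])
print(p_of_A)
```

For this matrix the interpolating polynomial is $p(x) = 2 - x$, so $p(A) = 2I - A = A^*$.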
(4)$\Leftrightarrow$(2): If (9.1) holds, then multiplying by $U$ from the left gives

\[ AU = U \mathrm{diag}(\lambda_1, \dots, \lambda_n) \]

or

\[ Au_i = \lambda_i u_i, \quad i = 1, \dots, n, \]

where $u_i$ is the $i$th column of $U$, $i = 1, \dots, n$. Thus, the column
vectors of $U$ are eigenvectors of $A$ and they form an orthonormal
basis of $\mathbb{C}^n$ because $U$ is a unitary matrix.

Conversely, if $A$ has a set of eigenvectors that form an orthonormal
basis for $\mathbb{C}^n$, then the matrix $U$ consisting of these vectors as
columns is unitary and satisfies (9.1).


(5)$\Leftrightarrow$(1): Assume that $A$ is normal and let $u$ be a unit eigenvector
of $A$ corresponding to eigenvalue $\lambda$. Extend $u$ to a unitary matrix $U$
with $u$ as the first column. Then

\[ U^*AU = \begin{pmatrix} \lambda & \alpha \\ 0 & A_1 \end{pmatrix}. \tag{9.2} \]

The normality of $A$ forces $\alpha = 0$.

Taking the conjugate transpose and by a simple computation, $u$
is an eigenvector of $A^*$ corresponding to the eigenvalue $\bar\lambda$ of $A^*$.

To see the other way around, we use induction on $n$. Note that

\[ Ax = \lambda x \iff (U^*AU)(U^*x) = \lambda(U^*x) \]

for any $n$-square unitary matrix $U$. Thus, when considering $Ax = \lambda x$,
we may assume that $A$ is upper-triangular by Schur decomposition.
Take $e_1 = (1, 0, \dots, 0)^T$. Then $e_1$ is an eigenvector of $A$. Hence,
by assumption, $e_1$ is an eigenvector of $A^*$. A direct computation of
$A^*e_1 = \mu e_1$ (for some scalar $\mu$) yields that the first column of $A^*$
must consist of zeros except the first component. Thus, if we write

\[ A = \begin{pmatrix} \lambda_1 & 0 \\ 0 & B \end{pmatrix}, \quad \text{then} \quad A^* = \begin{pmatrix} \bar\lambda_1 & 0 \\ 0 & B^* \end{pmatrix}. \]

Every eigenvector of $A$ is an eigenvector of $A^*$, thus this property is
inherited by $B$ and $B^*$. An induction hypothesis on $B$ shows that $A$
is diagonal. It follows that $A$ is normal.

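(6)$\Leftrightarrow$(5): Let $(A + A^*)u = \lambda u$ and $Au = \mu u$, $u \ne 0$. Then
$A^*u = \lambda u - Au = (\lambda - \mu)u$; that is, $u$ is an eigenvector of $A^*$.
Conversely, let $Au = \lambda u$ and $A^*u = \mu u$. Then $(A + A^*)u = (\lambda + \mu)u$.

(7)$\Leftrightarrow$(5) is similarly proven.

(8)$\Leftrightarrow$(1): It is sufficient to notice that $B = \frac{A + A^*}{2}$ and $C = \frac{A - A^*}{2i}$.

(9)$\Leftrightarrow$(1): We show that (1) implies (9). The other direction is
easy. Upon computation, we have that $A^*A = AA^*$ implies

\[ \begin{pmatrix} B^*B & B^*C \\ C^*B & C^*C + D^*D \end{pmatrix} = \begin{pmatrix} BB^* + CC^* & CD^* \\ DC^* & DD^* \end{pmatrix}. \]

Therefore,

\[ B^*B = BB^* + CC^* \quad \text{and} \quad C^*C + D^*D = DD^*. \]

By taking the trace for both sides of the first identity and noticing
that $\mathrm{tr}(BB^*) = \mathrm{tr}(B^*B)$, we obtain $\mathrm{tr}(CC^*) = 0$. This forces $C = 0$.
Thus $B$ is normal, and so is $D$ by the second identity.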

We have shown that the first nine conditions are all equivalent. 

(9)$\Rightarrow$(10): It suffices to note that $\mathbb{C}^n = W \oplus W^\perp$ and that a basis
of $W$ and a basis of $W^\perp$ form a basis of $\mathbb{C}^n$.

(10)$\Rightarrow$(11)$\Rightarrow$(4): (11) is a restatement of (10) with $W$ spanned
by an eigenvector of $A$. For (11)$\Rightarrow$(4), if $Ax = \lambda x$, where $x \ne 0$, we
may assume that $x$ is a unit vector. By (11), $x^\perp$ is invariant under
$A$. Consider the restriction of $A$ on $x^\perp$. Inductively, we obtain a set
of eigenvectors of $A$ that form an orthonormal basis of $\mathbb{C}^n$.

(12)$\Rightarrow$(1): It is by a direct computation. To show (2)$\Rightarrow$(12), we
write $U = (u_1, \dots, u_n)$, where $u_i$ is the $i$th column of $U$. Then

\[ A = U \mathrm{diag}(\lambda_1, \dots, \lambda_n) U^* = \lambda_1 u_1u_1^* + \cdots + \lambda_n u_nu_n^*. \]

Take $E_i = u_iu_i^*$, $i = 1, \dots, n$. (12) then follows.
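(13)$\Leftrightarrow$(2): Let $A = U^*TU$ be a Schur decomposition of $A$, where
$U$ is unitary and $T = (t_{ij})$ is upper-triangular. Then $A^*A = U^*T^*TU$.
Hence, $\mathrm{tr}(A^*A) = \mathrm{tr}(T^*T)$. On the other hand, upon computation,

\[ \mathrm{tr}(A^*A) = \sum_{i,j=1}^n |a_{ij}|^2, \qquad \mathrm{tr}(T^*T) = \sum_{i=1}^n |\lambda_i|^2 + \sum_{i<j} |t_{ij}|^2. \]

Thus, $\mathrm{tr}(A^*A) = \sum_{i=1}^n |\lambda_i|^2$ if and only if $t_{ij} = 0$ for all $i < j$; that
is, $T$ is diagonal and $A$ is unitarily diagonalizable.


(14)$\Rightarrow$(13): If the singular values of $A$ are $\sigma_1, \dots, \sigma_n$, then

\[
\begin{aligned}
\mathrm{tr}(A^*A) &= \lambda_1(A^*A) + \cdots + \lambda_n(A^*A) \\
&= \sigma_1^2 + \cdots + \sigma_n^2 \\
&= |\lambda_1|^2 + \cdots + |\lambda_n|^2,
\end{aligned}
\]

which is (13). For the other direction, obviously (13)$\Rightarrow$(2)$\Rightarrow$(14).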
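Condition (13) says that the nonnegative quantity $\mathrm{tr}(A^*A) - \sum_i |\lambda_i|^2 = \sum_{i<j} |t_{ij}|^2$ vanishes exactly for normal matrices. A small sketch for $2 \times 2$ matrices follows; the name `defect`, the closed-form eigenvalue routine, and the two sample matrices are assumptions of this illustration.

```python
import cmath

def eigs2(a):
    """Eigenvalues of a 2x2 complex matrix from its characteristic polynomial."""
    tr = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    root = cmath.sqrt(tr * tr - 4 * det)
    return [(tr + root) / 2, (tr - root) / 2]

def frobenius_sq(a):
    """tr(A*A), i.e. the sum of squared moduli of all entries."""
    return sum(abs(x) ** 2 for row in a for x in row)

def defect(a):
    """tr(A*A) minus the sum of |lambda_i|^2; zero exactly when A is normal."""
    return frobenius_sq(a) - sum(abs(z) ** 2 for z in eigs2(a))

normal_example = [[1, 1], [-1, 1]]      # eigenvalues 1 + i and 1 - i
nonnormal_example = [[1, 1], [0, 1]]    # upper triangular with t_12 = 1

print(defect(normal_example), defect(nonnormal_example))
```

For the triangular example the defect equals $|t_{12}|^2 = 1$, in line with the formula for $\mathrm{tr}(T^*T)$.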


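(15)$\Rightarrow$(13): We may assume that $A$ is an upper-triangular matrix,
because the identity holds when $A$ is replaced by $U^*AU$, where $U$ is
any unitary matrix. Notice that

\[ (A + A^*)^2 = A^2 + A^*A + AA^* + (A^*)^2. \]

It follows that

\[ \mathrm{tr}(A^*A) = \frac{1}{2}\big(\mathrm{tr}(A + A^*)^2 - \mathrm{tr}\, A^2 - \mathrm{tr}(A^*)^2\big). \]

Since $4(\mathrm{Re}\, \lambda_i)^2 = (\lambda_i + \bar\lambda_i)^2$, (15) implies (13). (15) follows from (2).
Similarly, (16)$\Rightarrow$(13) and (2)$\Rightarrow$(16).

(17)$\Rightarrow$(15): If the eigenvalues of $A + A^*$ are $\lambda_1 + \bar\lambda_1, \dots, \lambda_n + \bar\lambda_n$,
then their squares are the eigenvalues of $(A + A^*)^2$. Thus,

\[ \mathrm{tr}(A + A^*)^2 = \sum_{i=1}^n (\lambda_i + \bar\lambda_i)^2 = 4\sum_{i=1}^n (\mathrm{Re}\, \lambda_i)^2. \]

(17) follows from (2) at once.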

(18)$\Leftrightarrow$(14): Obviously (14) implies (18). For the converse, suppose,
without loss of generality, that $\lambda_1\bar\lambda_{\pi(1)} > 0$ is the largest eigenvalue
of $AA^*$. As is known, all $|\lambda_i| \le \sigma_{\max}(A)$. If $|\lambda_{\pi(1)}| \ne |\lambda_1|$,
then $|\lambda_1\bar\lambda_{\pi(1)}| < \sigma_{\max}^2(A) = \lambda_{\max}(AA^*)$, a contradiction. Hence,
$|\lambda_{\pi(1)}| = |\lambda_1|$. On the other hand, $\lambda_1\bar\lambda_{\pi(1)}$ is a positive number.
Thus, $\lambda_1 = \lambda_{\pi(1)}$. The rest follows by induction.

(19) is immediate from (1). To see the other way, we make use
of the facts that for any square matrices $X$ and $Y$ of the same size,

\[ \mathrm{tr}(XY) = \mathrm{tr}(YX) \]

and

\[ \mathrm{tr}(X^*X) = 0 \iff X = 0. \]

Upon computation and noting that $\mathrm{tr}(AA^*)^2 = \mathrm{tr}(A^*A)^2$, we have

\[
\begin{aligned}
\mathrm{tr}\big((A^*A - AA^*)^*(A^*A - AA^*)\big) &= \mathrm{tr}(A^*A - AA^*)^2 \\
&= \mathrm{tr}(A^*A)^2 - \mathrm{tr}\big((A^*)^2A^2\big) - \mathrm{tr}\big(A^2(A^*)^2\big) + \mathrm{tr}(AA^*)^2,
\end{aligned}
\]

which equals 0 by assumption. Thus, $A^*A - AA^* = 0$.

(20)$\Rightarrow$(19)$\Rightarrow$(1)$\Rightarrow$(20).


(21)$\Leftrightarrow$(1): By squaring both sides, the norm identity in (21) is
rewritten as the inner product identity

\[ (Ax, Ax) = (A^*x, A^*x), \]

which is equivalent to

\[ (x, A^*Ax) = (x, AA^*x), \]

or

\[ (x, (A^*A - AA^*)x) = 0. \]

This holds for all $x \in \mathbb{C}^n$ if and only if $A^*A - AA^* = 0$.

(22)$\Rightarrow$(21) by setting $x = y$; (21)$\Rightarrow$(1)$\Rightarrow$(22).

(23)$\Leftrightarrow$(1): This is by the uniqueness of the square root.

(24)$\Leftrightarrow$(1): If $A^* = AU$ for some unitary $U$, then

\[ A^*A = A^*(A^*)^* = (AU)(AU)^* = AA^*, \]

and $A$ is normal. For the converse, we show (2)$\Rightarrow$(24). Let

\[ A = V^* \mathrm{diag}(\lambda_1, \dots, \lambda_n) V, \]

where $V$ is unitary. Take

\[ U = V^* \mathrm{diag}(l_1, \dots, l_n) V, \]

where $l_i = \bar\lambda_i / \lambda_i$ if $\lambda_i \ne 0$, and $l_i = 1$ otherwise, for $i = 1, \dots, n$. Then

\[
\begin{aligned}
A^* &= V^* \mathrm{diag}(\bar\lambda_1, \dots, \bar\lambda_n) V \\
&= V^* \mathrm{diag}(\lambda_1, \dots, \lambda_n) V\, V^* \mathrm{diag}(l_1, \dots, l_n) V \\
&= AU.
\end{aligned}
\]

Similarly, (25) is equivalent to (1).
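(26)$\Leftrightarrow$(1): If $A = UP$, where $U$ is unitary and $P$ is positive
semidefinite, then $A^*A = AA^*$ implies

\[ P^*P = UPP^*U^* \quad \text{or} \quad P^2 = UP^2U^*. \]

By taking square roots, we have

\[ P = UPU^* \quad \text{or} \quad PU = UP. \]

The other direction is easy to check: $A^*A = P^2 = AA^*$.

(27)$\Leftrightarrow$(26): Note that $U$ is invertible.

(28)$\Leftrightarrow$(26): We show that (28) implies (26). The other direction
is immediate by multiplying $P$ from the right-hand side.

Suppose $AP = PA$; that is, $UP^2 = PUP$. If $P$ is nonsingular,
then obviously $UP = PU$. Let $r = \mathrm{rank}(A) = \mathrm{rank}(P)$ and write

\[ P = V^* \begin{pmatrix} D & 0 \\ 0 & 0 \end{pmatrix} V, \]

where $V$ is unitary and $D$ is $r \times r$ positive definite diagonal, $r < n$.
Then $UP^2 = PUP$ gives, with $W = VUV^*$,

\[ W \begin{pmatrix} D^2 & 0 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} D & 0 \\ 0 & 0 \end{pmatrix} W \begin{pmatrix} D & 0 \\ 0 & 0 \end{pmatrix}. \]

Partition $W$ as

\[ W = \begin{pmatrix} W_1 & W_2 \\ W_3 & W_4 \end{pmatrix}, \quad \text{where } W_1 \text{ is } r \times r. \]

Then

\[ W_1D^2 = DW_1D \quad \text{and} \quad W_3D^2 = 0, \]

which imply, for $D$ is nonsingular,

\[ W_1D = DW_1 \quad \text{and} \quad W_3 = 0. \]

It follows that $W_2 = 0$ because $W$ is unitary and that

\[ W \begin{pmatrix} D & 0 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} D & 0 \\ 0 & 0 \end{pmatrix} W. \]

This results in $UP = PU$ at once.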

(29)$\Leftrightarrow$(2): Let $A$ commute with $B$, where $B$ is normal and all the
eigenvalues of $B$ are distinct. Write $B = V^*CV$, where $V$ is unitary
and $C = \mathrm{diag}(c_1, \dots, c_n)$ is diagonal and all $c_i$ are distinct. Then
$AB = BA$ implies $WC = CW$, where $W = (w_{ij}) = VAV^*$. It follows
that $w_{ij}c_j = c_iw_{ij}$, i.e., $w_{ij}(c_i - c_j) = 0$ for all $i$ and $j$. Since $c_i \ne c_j$
whenever $i \ne j$, we have $w_{ij} = 0$ whenever $i \ne j$. Thus, $VAV^*$
is diagonal; that is, $A$ is unitarily diagonalizable, and it is normal.
Conversely, if (2) holds, then we take $B = U \mathrm{diag}(1, 2, \dots, n) U^*$. It
is easy to check that $B$ is normal and $AB = BA$.


(30), (31), and (32) are equivalent to (1) by direct computation. 
(33)$\Rightarrow$(20): If $A$ commutes with $A^*A$, then

\[ AA^*A = A^*A^2. \]

Multiply both sides by $A^*$ from the left to get

\[ (A^*A)^2 = (A^*)^2A^2. \]

(34)$\Rightarrow$(19) is similarly proven. (1) easily implies (33) and (34).

(35)$\Leftrightarrow$(1): Take $B = A$ for the normality. For the converse,
suppose that $A$ is normal and that $A$ and $B$ commute, and let
$A = U^* \mathrm{diag}(\lambda_1, \dots, \lambda_n) U$, where $U$ is unitary. Then $AB = BA$ implies

\[ \mathrm{diag}(\lambda_1, \dots, \lambda_n)(UBU^*) = (UBU^*)\,\mathrm{diag}(\lambda_1, \dots, \lambda_n). \]

Denote $T = UBU^* = (t_{ij})$. Then

\[ \mathrm{diag}(\lambda_1, \dots, \lambda_n)\, T = T\, \mathrm{diag}(\lambda_1, \dots, \lambda_n), \]

which gives $(\lambda_i - \lambda_j)t_{ij} = 0$; thus, $(\bar\lambda_i - \bar\lambda_j)t_{ij} = 0$, for all $i$ and $j$,
which, in return, implies that $A^*B = BA^*$.

(36)$\Leftrightarrow$(1): This is a combination of two facts: $\mathrm{tr}(XY - YX) = 0$
for all square matrices $X$ and $Y$ of the same size; and if matrix $X$ is
positive semidefinite, then $\mathrm{tr}\, X = 0$ if and only if $X = 0$.

For (37), notice that $|(Ax, x)| \le (|A|x, x)$ is unitarily invariant;
that is, it holds if and only if $|(U^*AUx, x)| \le (|U^*AU|x, x)$, where $U$
is any unitary matrix. (Bear in mind that $|U^*AU| = U^*|A|U$.) By
the Schur decomposition, we may assume that $A$ is upper-triangular.

If $A$ is normal, then $A$ is unitarily diagonalizable. We may assume
that $A = \mathrm{diag}(\lambda_1, \dots, \lambda_n)$. Then the inequality is the same as saying
that $\big|\sum_i \lambda_i|x_i|^2\big| \le \sum_i |\lambda_i||x_i|^2$, which is obvious.

For the converse, if $|(Ax, x)| \le (|A|x, x)$ for all $x \in \mathbb{C}^n$, where
$A$ is upper-triangular, we show that $A$ is in fact a diagonal matrix.
We demonstrate the proof for the case of $n = 2$; the general case is
similarly proven by induction. Let $A = \begin{pmatrix} \lambda_1 & \alpha \\ 0 & \lambda_2 \end{pmatrix}$. We show $\alpha = 0$.

If $\lambda_1 = \lambda_2 = 0$ and $\alpha \ne 0$, then take positive numbers $s, t$, $s > t$;
set $x = (s, t)^T$. Then $|x^*Ax| = st|\alpha|$ and $x^*|A|x = t^2|\alpha|$. That
$|x^*Ax| \le x^*|A|x$ implies $s \le t$, a contradiction.

Suppose $\lambda_1$ (or $\lambda_2$) is not 0. Let $|A| = \begin{pmatrix} a & b \\ \bar{b} & c \end{pmatrix}$. Putting $x = (1, 0)^T$
in $|x^*Ax| \le x^*|A|x$ gives $|\lambda_1| \le a$. Computing $|A|^2 = A^*A$ and
comparing the entries in the upper-left corners, we have
$a^2 + |b|^2 = |\lambda_1|^2$. So $b = 0$. Inspecting the entries in the (1,2) positions of $|A|^2$
and $A^*A$, we get $\bar\lambda_1\alpha = (a + c)b = 0$. Thus $\alpha = 0$. ∎

As many as 90 equivalent conditions of normal matrices have been
observed in the literature. More are shown in the following exercises
and in the later sections.


Problems 


1. Is matrix ( : normal, Hermitian, or symmetric? How about & ') ig 
2. Let A be an n-square matrix. Show that for all unit vectors $x \in \mathbb{C}^n$
\[ |(Ax,x)|^2 \le (|A|^2x, x). \]

[Note: If the square is dropped, then A has to be normal. See (37).]
3. Show that each of the following conditions is equivalent to the
normality of matrix $A \in M_n$.

(a) A has a linearly independent set of n eigenvectors, and any two
corresponding to distinct eigenvalues are orthogonal.

(b) $(Ax, Ay) = (A^*x, A^*y)$ for all $x, y \in \mathbb{C}^n$.

(c) $(Ax, Ax) = (A^*x, A^*x)$ for all $x \in \mathbb{C}^n$.

4. Let $\lambda_1, \lambda_2, \dots, \lambda_n$ be the eigenvalues of matrix $A \in M_n$. Show that
\[ |\lambda_i\lambda_j| \le \lambda_1(A^*A) \]
for any pair $\lambda_i, \lambda_j$, where $\lambda_1(A^*A)$ is the largest eigenvalue of $A^*A$.

5. Let A be an n × n complex matrix with eigenvalues $\lambda_i$ and singular
values $\sigma_i$ arranged as $|\lambda_1| \ge \cdots \ge |\lambda_n|$ and $\sigma_1 \ge \cdots \ge \sigma_n$. Show that
\[ \sigma_1\sigma_2\cdots\sigma_k = |\lambda_1\lambda_2\cdots\lambda_k|,\quad k = 1, 2, \dots, n, \quad\Rightarrow\quad A^*A = AA^*. \]

6. Let A be a normal matrix. Show that Ax = 0 if and only if $A^*x = 0$.

7. Let A be a normal matrix. Show that if x is an eigenvector of A,
then $A^*x$ is also an eigenvector of A for the same eigenvalue.

8. If matrix A commutes with some normal matrix with distinct
eigenvalues, show that A is normal. Is the converse true?

9. Show that unitary matrices, Hermitian matrices, skew-Hermitian
matrices, real orthogonal matrices, and permutation matrices are all
normal. Is a complex orthogonal matrix normal?

10. When is a normal matrix Hermitian? Positive semidefinite?
Skew-Hermitian? Unitary? Nilpotent? Idempotent?

11. When is a triangular matrix normal?

12. Let A be a square matrix. Show that if A is a normal matrix, then
f(A) is normal for any polynomial f. If f(A) is normal for some
nonzero polynomial f, does it follow that A is normal?

13. Show that (33) is equivalent to (28) using Problem 31, Section 7.1.

14. Show that two normal matrices are similar if and only if they have the
same set of eigenvalues and if and only if they are unitarily similar.

15. Let A and B be normal matrices of the same size. If AB = BA,
show that AB is normal and that there exists a unitary matrix U
that diagonalizes both A and B.
16. Let A be a nonsingular matrix and let $M = A^{-1}A^*$. Show that A is
normal if and only if M is unitary.

17. Let $A \in M_n$ be a normal matrix. Show that for any unitary $U \in M_n$,
\[ \min\{|\lambda_i(A)|\} \le |\lambda_i(AU)| \le \max\{|\lambda_i(A)|\}. \]

18. Let A be a normal matrix. If $A^k = I$ for some positive integer k,
show that A is unitary.

19. Let B be an n-square matrix and let A be the block matrix
\[ A = \begin{pmatrix} B & B^* \\ B^* & B \end{pmatrix}. \]
Show that A is normal and that if B is normal with eigenvalues
$\lambda_t = x_t + i y_t$, $x_t, y_t \in \mathbb{R}$, $t = 1, 2, \dots, n$, then
\[ \det A = (4i)^n \prod_{t=1}^n x_t y_t. \]

20. Show that $\mathbb{C}^n = \operatorname{Ker} A \oplus \operatorname{Im} A$ for any n-square normal matrix A.

21. Let A and B be n × n normal matrices. If $\operatorname{Im} A \perp \operatorname{Im} B$; that is,
(x, y) = 0 for all $x \in \operatorname{Im} A$, $y \in \operatorname{Im} B$, show that A + B is normal.

22. Let A be a normal matrix. Show that $A\overline{A} = 0 \Leftrightarrow AA^T = A^TA = 0$.

23. Let A be Hermitian, B be skew-Hermitian, and C = A + B. Show
that the following statements are equivalent.

(a) C is normal. (b) AB = BA. (c) AB is skew-Hermitian.

24. Show that for any n-square complex matrix A
\[ \operatorname{tr}(A^*A)^2 \ge \operatorname{tr}\bigl((A^*)^2A^2\bigr). \]
Equality holds if and only if A is normal. Is it true that
\[ (A^*A)^2 \ge (A^*)^2A^2? \]

25. Verify with the following matrix A that $A^*A - AA^*$ is an entrywise
nonnegative matrix, but A is not normal (i.e., $A^*A - AA^* \neq 0$):
26. Let A and B be n × n matrices. The matrix AB − BA is called the
commutator of A and B, and it is denoted by [A, B]. Show that

(a) tr[A, B] = 0.

(b) $[A, B]^* = [B^*, A^*]$.

(c) [A, B + C] = [A, B] + [A, C].

(d) [A, [B, C]] + [B, [C, A]] + [C, [A, B]] = 0.

(e) [A, B] is never similar to the identity matrix.

(f) If A and B are both Hermitian or skew-Hermitian, then [A, B]
is skew-Hermitian.

(g) If A and B are Hermitian, then the real part of every eigenvalue
of [A, B] is zero.

(h) A is normal if and only if $[A, A^*] = 0$.

(i) A is normal if and only if $[A, [A, A^*]] = 0$.

27. Complete the proof of Condition (37) for the case n > 2.

28. Let A and B be normal matrices of the same size. Show that

(a) $AM = MA \Rightarrow A^*M = MA^*$.

(b) $AM = MB \Rightarrow A^*M = MB^*$.
29. Let A and B be n-square matrices. 
(a) If A and B are Hermitian, show that AB is Hermitian if and 
only if A and B commute. 


(b) Give an example showing that the generalization of (a) to nor- 
mal matrices is not valid. 

(c) Show (a) does hold for normal matrices if A (or B) is such a 
matrix that its different eigenvalues have different moduli. 

(d) If A is positive semidefinite and B is normal, show that AB is 
normal if and only if A and B commute. 
9.2 Normal Matrices with Zero and One Entries


Matrices of zeros and ones are referred to as (0, 1)-matrices. (0, 1)- 
matrices have applications in graph theory and combinatorics. This 
section presents three theorems on matrices with zero and one entries: 
the first one shows how to construct a symmetric (normal) (0, 1)- 
matrix from a given (0, 1)-matrix, the second one gives a sufficient 
condition on normality, and the last one is on commutativity. 
Given an m × n (0, 1)-matrix, say A, we add the 1s in each row
to get row sums $r_1, r_2, \dots, r_m$. We call $R = (r_1, r_2, \dots, r_m)$ the row
sum vector of A and denote it by R(A). Similarly, we can define the
column sum vector of A and denote it by S(A). For example,
\[ A = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 1 \\ 0 & 1 & 1 & 1 \\ 1 & 1 & 0 & 0 \end{pmatrix}, \]
\[ R(A) = (r_1, r_2, r_3, r_4) = (1, 2, 3, 2), \]
\[ S(A) = (s_1, s_2, s_3, s_4) = (2, 3, 1, 2). \]

The sum of the components of R(A) is equal to the total number
of 1s in A. The same is true of S(A). Given vectors R and S of
nonnegative integers, there may not exist a (0, 1)-matrix that has R
as its row sum vector and S as its column sum vector. For instance,
no 3 × 3 (0, 1)-matrix A satisfies R(A) = (3, 1, 1) and S(A) = (3, 2, 0).
For what R and S does there exist a (0, 1)-matrix that has R and S
as its row and column sum vectors? This is an intriguing problem
and has been well studied in combinatorial matrix theory.
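
The row and column sum vectors are easy to compute mechanically. The following Python sketch is an illustration only: the function names `row_sums`, `col_sums`, and `exists_01_matrix` are hypothetical helpers, and the 4 × 4 matrix is one (0, 1)-matrix with R(A) = (1, 2, 3, 2) and S(A) = (2, 3, 1, 2). The exhaustive search is feasible only for very small sizes; it confirms that no 3 × 3 (0, 1)-matrix has row sums (3, 1, 1) and column sums (3, 2, 0).

```python
from itertools import product

def row_sums(a):
    """Row sum vector R(A) of a matrix given as a list of rows."""
    return tuple(sum(row) for row in a)

def col_sums(a):
    """Column sum vector S(A)."""
    return tuple(sum(col) for col in zip(*a))

def exists_01_matrix(r, s):
    """Brute-force search for a (0,1)-matrix with row sums r and column sums s."""
    m, n = len(r), len(s)
    for bits in product((0, 1), repeat=m * n):
        a = [bits[i * n:(i + 1) * n] for i in range(m)]
        if row_sums(a) == tuple(r) and col_sums(a) == tuple(s):
            return True
    return False

# Illustrative matrix (an assumption of this sketch).
A = [[0, 1, 0, 0],
     [1, 0, 0, 1],
     [0, 1, 1, 1],
     [1, 1, 0, 0]]

print(row_sums(A), col_sums(A))
print(exists_01_matrix((3, 1, 1), (3, 2, 0)))
```

The search fails for R = (3, 1, 1), S = (3, 2, 0) because a row sum of 3 forces a 1 in every column, while the third column sum is 0.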

Apparently, for a symmetric (0,1)-matrix, the row sum vector 
and column sum vector coincide. The following result says the con- 
verse is also true in some sense (via reconstruction). Thus there exists 
a (0, 1)-matrix that has the given vector R as its row and column sum 
vectors if and only if there exists such a normal (0, 1)-matrix and if 
and only if there exists such a symmetric (0, 1)-matrix. 


Theorem 9.2 If there exists a (0, 1)-matrix A with R(A) = S(A) =
R, then there exists a symmetric (0, 1)-matrix B that has the same
row and column sum vectors as A; that is, R(B) = S(B) = R.
Proof. We use induction on n. If n = 1 or 2, it is obvious. Let
n > 2. We may also assume that A contains no zero row (or column).
Suppose that the assertion is true for matrices of order less than n.
If A is symmetric, there is nothing to prove. So we assume that
$A = (a_{ij})$ is not symmetric.

If the first column is the transpose of the first row, i.e., they are
identical when regarded as n-tuples, then by induction hypothesis on
the (n − 1)-square submatrix in the lower-right corner, we are done.
Otherwise, because A is not symmetric and R(A) = S(A), we have
$a_{1p} = 1$, $a_{1q} = 0$, $a_{p1} = 0$, $a_{q1} = 1$ for some p and q.

Now consider the rows p and q. With $a_{p1} = 0$, $a_{q1} = 1$, if $a_{pt} \le
a_{qt}$, $t = 2, \dots, n$, then $r_p < r_q$. For the columns p and q, with $a_{1p} =
1$, $a_{1q} = 0$, if $a_{tp} \ge a_{tq}$, $t = 2, \dots, n$, then $s_p > s_q$. Both cases cannot
happen at the same time, or they would lead to $r_p < r_q = s_q < s_p =
r_p$, a contradiction. Therefore, there must exist a t such that either
$a_{pt} = 1$ and $a_{qt} = 0$ or $a_{tp} = 0$ and $a_{tq} = 1$.

Now interchange the 0s and 1s in the intersections of rows p, q
and columns 1, t, or columns p, q and rows 1, t. Notice that such an
interchange reduces the number of different entries in the first row
and first column of A without affecting the row and column sum
vectors of A. Inductively, we can have a matrix in which the first
column is the transpose of the first row. By induction on the size of
matrices, a symmetric matrix is obtained from A. ∎
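
The interchange step in the proof can be tried on a small case. The sketch below is only an illustration under stated assumptions: the helper `interchange` and the 4 × 4 cyclic permutation matrix are hypothetical choices, not taken from the text, and indices are 0-based (so rows p = 2, q = 4 of the proof become 1 and 3). It shows that flipping a 2 × 2 pattern of the form [[0, 1], [1, 0]] leaves all row and column sums unchanged while bringing the first column closer to the transpose of the first row.

```python
def row_sums(a):
    return tuple(sum(row) for row in a)

def col_sums(a):
    return tuple(sum(col) for col in zip(*a))

def interchange(a, p, q, s, t):
    """Flip the entries at rows p, q and columns s, t.

    Only allowed when that 2x2 submatrix is [[0,1],[1,0]] or [[1,0],[0,1]],
    which is exactly the case where row and column sums are preserved.
    """
    sub = (a[p][s], a[p][t], a[q][s], a[q][t])
    if sub not in ((0, 1, 1, 0), (1, 0, 0, 1)):
        raise ValueError("submatrix is not an interchangeable pattern")
    b = [row[:] for row in a]
    for i in (p, q):
        for j in (s, t):
            b[i][j] = 1 - a[i][j]
    return b

def first_row_col_mismatch(a):
    """Number of positions where the first row and first column differ."""
    return sum(1 for j in range(len(a)) if a[0][j] != a[j][0])

# Hypothetical example: a nonsymmetric matrix with R(A) = S(A) = (1, 1, 1, 1).
A = [[0, 1, 0, 0],
     [0, 0, 1, 0],
     [0, 0, 0, 1],
     [1, 0, 0, 0]]

A1 = interchange(A, 1, 3, 0, 2)
print(row_sums(A1) == row_sums(A), col_sums(A1) == col_sums(A))
print(first_row_col_mismatch(A), first_row_col_mismatch(A1))
```

In this particular example a single interchange already produces a symmetric matrix; in general the proof repeats the step and then recurses on the lower-right submatrix.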


Take the following matrix A as an example. In rows 2 and 4,
replacing the 0s by 1s and 1s by 0s (all ×s remain unchanged) results
in the matrix $A_1$ that has the same row and column sum vectors as
A, whereas the first column of $A_1$ is “closer” to the transpose of the
first row of $A_1$ (or A).


) Ai = 


ex OX 
x XK xX BR 
OX xXx 
x xX xX Oo 
OX xX 
x KX KX 
Pe xX OX 
x xX xX Oo 


In what follows, $J_n$, or simply J, denotes the n-square matrix all
of whose entries are 1. As usual, I is the identity matrix.

The following theorem often appears in combinatorics when a 
configuration of subsets is under investigation. 
Theorem 9.3 Let A be an n-square (0, 1)-matrix. If
\[ AA^T = tI + J \tag{9.3} \]
for some positive integer t, then A is a normal matrix.

Proof. By considering the diagonal entries, we see that (9.3) implies
that each row sum of A equals t + 1; that is,
\[ AJ = (t+1)J. \tag{9.4} \]
Matrix A is nonsingular, since the determinant
\[ (\det A)^2 = \det(AA^T) = \det(tI + J) = (t+n)t^{n-1} \]
is nonzero. Thus by (9.4), we have
\[ A^{-1}J = (t+1)^{-1}J. \]
Multiplying both sides of (9.3) by J from the right reveals
\[ AA^TJ = tJ + J^2 = (t+n)J. \]
It follows by multiplying $A^{-1}$ from the left that
\[ A^TJ = (t+n)A^{-1}J = (t+n)(t+1)^{-1}J. \]
By taking the transpose, we have
\[ JA = (t+n)(t+1)^{-1}J. \tag{9.5} \]
Multiply both sides by J from the right to get
\[ JAJ = n(t+n)(t+1)^{-1}J. \]
Multiply both sides of (9.4) by J from the left to get
\[ JAJ = n(t+1)J. \]
It follows by comparison of the right-hand sides that
\[ (t+1)^2 = t+n \quad\text{or}\quad n = t^2 + t + 1. \]
Substituting this into (9.5), one gets
\[ JA = (t+1)J. \]
From (9.4),
\[ AJ = JA \quad\text{or}\quad A^{-1}JA = J. \]
We then have
\[ A^TA = A^{-1}(AA^T)A = A^{-1}(tI+J)A = tI + A^{-1}JA = tI + J = AA^T. \] ∎

Our next result asserts that if the product of two (0, 1)-matrices
is a matrix whose diagonal entries are all 0s and whose off-diagonal
entries are all 1s, then these two matrices commute. The proof of
this theorem uses the determinant identity (Problem 5, Section 2.2)
\[ \det(I_m + AB) = \det(I_n + BA), \]
where A and B are m × n and n × m matrices, respectively.

Theorem 9.4 Let A and B be n-square (0, 1)-matrices such that
\[ AB = J_n - I_n. \tag{9.6} \]
Then
\[ AB = BA. \]

Proof. Let $a_i$ and $b_j$ be the columns of A and $B^T$, respectively:
\[ A = (a_1, \dots, a_n), \qquad B^T = (b_1, \dots, b_n). \]
Then, by computation,
\[ 0 = \operatorname{tr}(AB) = \operatorname{tr}(BA) = \sum_{i=1}^n b_i^T a_i. \]
Thus $b_i^T a_i = 0$ for each i, since A and B have nonnegative entries.
Rewrite equation (9.6) as
\[ I_n = J_n - AB \quad\text{or}\quad I_n = J_n - \sum_{k=1}^n a_k b_k^T. \]
Then
\[ I_n + a_i b_i^T + a_j b_j^T = J_n - \sum_{k \neq i, j} a_k b_k^T. \]
Notice that the right-hand side contains n − 1 matrices, each of which
is of rank one. Thus, by the rank formula for sum (Problem 6), the
matrix on the right-hand side has rank at most n − 1. Hence the
matrix on the left-hand side is singular. We have
\[
\begin{aligned}
0 &= \det(I_n + a_i b_i^T + a_j b_j^T) \\
  &= \det\bigl(I_n + (a_i, a_j)(b_i, b_j)^T\bigr) \\
  &= \det\bigl(I_2 + (b_i, b_j)^T (a_i, a_j)\bigr) \\
  &= \det\left( \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} + \begin{pmatrix} 0 & b_i^T a_j \\ b_j^T a_i & 0 \end{pmatrix} \right) \\
  &= 1 - (b_i^T a_j)(b_j^T a_i).
\end{aligned}
\]
This forces $b_i^T a_j = 1$ for each pair of i and j, $i \neq j$, because A and
B are (0, 1)-matrices. It follows, by combining with $b_i^T a_i = 0$, that
\[ BA = (b_i^T a_j) = J_n - I_n = AB. \] ∎
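
For very small n, Theorem 9.4 can be confirmed by exhaustive search. The sketch below is illustrative only: `check_commuting` is a hypothetical helper, and it enumerates every pair of n × n (0, 1)-matrices, which is practical for n = 2 and n = 3 but not beyond. The n = 3 case examines 512 × 512 pairs and may take a few seconds.

```python
from itertools import product

def matmul(x, y):
    """Product of two square matrices given as sequences of rows."""
    cols = list(zip(*y))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in x]

def all_01_matrices(n):
    """Generate every n x n matrix with entries in {0, 1}."""
    for bits in product((0, 1), repeat=n * n):
        yield [bits[i * n:(i + 1) * n] for i in range(n)]

def check_commuting(n):
    """Return (number of pairs with AB = J - I, whether each also has BA = J - I)."""
    target = [[0 if i == j else 1 for j in range(n)] for i in range(n)]
    mats = list(all_01_matrices(n))
    pairs, all_commute = 0, True
    for a in mats:
        for b in mats:
            if matmul(a, b) == target:
                pairs += 1
                if matmul(b, a) != target:
                    all_commute = False
    return pairs, all_commute

pairs2, ok2 = check_commuting(2)
pairs3, ok3 = check_commuting(3)
print(pairs2, ok2)
print(pairs3, ok3)
```

At least one pair always exists (take A = I and B = J − I), so the check is not vacuous.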


Problems 


1. Show that no 3 × 3 (0, 1)-matrix A satisfies R(A) = (3, 1, 1) and
S(A) = (3, 2, 0). But there does exist a 3 × 3 (0, 1)-matrix B that
satisfies S(B) = (3, 1, 1) and R(B) = (3, 2, 0).

2. Show that $A \in M_n$ is normal if and only if I − A is normal.

3. Let A be an n-square (0, 1)-matrix. Denote the number of 1s in row
i by $r_i$ and in column j by $c_j$. Show that
\[ A \text{ is normal} \;\Rightarrow\; r_i = c_i \text{ for each } i \text{ and } J_n - A \text{ is normal}. \]

4. Construct a nonsymmetric 4 × 4 normal matrix of zeros and ones
such that each row and each column sum equal 2.

5. Let A be a 3 × 3 (0, 1)-matrix. Show that det A equals 0, ±1, or ±2.

6. Let $A_1, \dots, A_n$ be m × m matrices each with rank 1. Show that
\[ \operatorname{rank}(A_1 + \cdots + A_n) \le n. \]

7. Let A be a (0, 1)-matrix with row sum vector $R = (r_1, r_2, \dots, r_m)$,
where $r_1 \ge r_2 \ge \cdots \ge r_m$. Show that


Ynsits s ee 


j=t+l1 

8. Does there exist a normal matrix with real entries of the sign pattern
\[ \begin{pmatrix} + & + & 0 \\ + & 0 & + \\ + & + & + \end{pmatrix}? \]

9. If A is an n × n matrix with integer entries, show that 2Ax = x has
no nontrivial solutions; that is, the only solution is x = 0.

10. Let A be an n-square (0, 1)-matrix such that $AA^T = tI + J$ for some
positive integer t, and let C = J − A. Show that A commutes with
C and $C^T$. Compute $CC^T$.

11. Let A be an n-square (0, 1)-matrix such that $AA^T = tI + J$ for some
positive integer t. Find the singular values of A in terms of n and t
and conclude that all A satisfying the equation have the same singular
values. Do they have the same eigenvalues? When is A nonsingular?

12. Let A be a v × v matrix with zero and one entries such that
\[ AA^T = (k - h)I + hJ, \]
where v, k, and h are positive integers satisfying 0 < h < k < v.
(a) Show that A is normal.
(b) Show that $k^2 - k = h(v - 1)$.
(c) Show that $A^{-1} = \dfrac{1}{k(k-h)}(kA^T - hJ)$.
(d) Find the eigenvalues of $AA^T$.
9.3 Normality and Cauchy–Schwarz-Type Inequalities


We consider in this section the inequalities involving diagonal entries, 
eigenvalues, and singular values of matrices. The equality cases of 
these inequalities will result in the normality of the matrices. 


Theorem 9.5 (Schur Inequality) Let $A = (a_{ij})$ be an n-square
complex matrix having eigenvalues $\lambda_1, \lambda_2, \dots, \lambda_n$. Then
\[ \sum_{i=1}^n |\lambda_i|^2 \le \sum_{i,j=1}^n |a_{ij}|^2. \]
Equality occurs if and only if A is normal.

Proof. Let $A = U^*TU$ be a Schur decomposition of A, where U is
unitary and T is upper-triangular. Then $A^*A = U^*T^*TU$; consequently,
$\operatorname{tr}(A^*A) = \operatorname{tr}(T^*T)$. Upon computation, we have
\[ \operatorname{tr}(A^*A) = \sum_{i,j=1}^n |a_{ij}|^2 \]
and
\[ \operatorname{tr}(T^*T) = \sum_{i=1}^n |\lambda_i|^2 + \sum_{i<j} |t_{ij}|^2. \]
The inequality is immediate. For the equality case, notice that each
$t_{ij} = 0$, i < j; that is, T is diagonal. Hence A is unitarily
diagonalizable, thus normal. The other direction is obvious. ∎
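
The Schur inequality is easy to observe numerically for 2 × 2 matrices, whose eigenvalues follow from the quadratic formula. The sketch below is an illustration; the helper names `eigenvalues_2x2` and `schur_gap` and the two sample matrices are assumptions of this sketch. The gap $\sum |a_{ij}|^2 - \sum |\lambda_i|^2$ vanishes for the normal matrix and is positive for the non-normal one.

```python
import cmath

def eigenvalues_2x2(a):
    """Eigenvalues of a 2x2 complex matrix from its characteristic polynomial."""
    (p, q), (r, s) = a
    tr, det = p + s, p * s - q * r
    disc = cmath.sqrt(tr * tr - 4 * det)
    return (tr + disc) / 2, (tr - disc) / 2

def schur_gap(a):
    """sum |a_ij|^2 - sum |lambda_i|^2, which Schur's inequality says is >= 0."""
    frob = sum(abs(x) ** 2 for row in a for x in row)
    spec = sum(abs(lam) ** 2 for lam in eigenvalues_2x2(a))
    return frob - spec

normal = [[0, 1], [-1, 0]]   # real skew-symmetric, hence normal
shear = [[1, 1], [0, 1]]     # upper-triangular and not diagonal, hence not normal

print(schur_gap(normal))
print(schur_gap(shear))
```

For the shear matrix the gap equals $|t_{12}|^2 = 1$, the off-diagonal contribution that appears in the proof.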


An interesting application of this result is to show that if matrices 
A, B, and AB are normal, then so is BA (Problem 11). 


Theorem 9.6 Let $A = (a_{ij})$ be an n-square complex matrix having
singular values $\sigma_1, \sigma_2, \dots, \sigma_n$. Then
\[ |\operatorname{tr} A| \le \sigma_1 + \cdots + \sigma_n. \tag{9.7} \]
Equality holds if and only if A = uP for some $P \ge 0$ and some $u \in \mathbb{C}$
with |u| = 1; consequently, A is normal (but not conversely).
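Proof. Let A = UDV be a singular value decomposition of A, where
$D = \operatorname{diag}(\sigma_1, \dots, \sigma_r, 0, \dots, 0)$ with $\sigma_1 \ge \cdots \ge \sigma_r > 0 = \sigma_{r+1} = \cdots = \sigma_n$,
r = rank(A), and U and V are unitary. By computation, we have
\[
\begin{aligned}
|\operatorname{tr} A| &= \Bigl|\sum_{i=1}^n \sum_{j=1}^n u_{ij}\sigma_j v_{ji}\Bigr|
 = \Bigl|\sum_{j=1}^n \Bigl(\sum_{i=1}^n u_{ij}v_{ji}\Bigr)\sigma_j\Bigr| \\
 &\le \sum_{j=1}^n \Bigl|\sum_{i=1}^n u_{ij}v_{ji}\Bigr|\sigma_j \\
 &\le \sum_{j=1}^n \Bigl(\sum_{i=1}^n |u_{ij}v_{ji}|\Bigr)\sigma_j \\
 &\le \sum_{j=1}^n \sigma_j.
\end{aligned}
\]
The last inequality was due to the Cauchy–Schwarz inequality:
\[ \sum_{i=1}^n |u_{ij}v_{ji}| \le \sqrt{\sum_{i=1}^n |u_{ij}|^2 \sum_{i=1}^n |v_{ji}|^2} = 1. \]
If equality holds for the overall inequality, then
\[ \Bigl|\sum_{i=1}^n u_{ij}v_{ji}\Bigr| = \sum_{i=1}^n |u_{ij}v_{ji}| = 1, \quad \text{for each } j \le r. \]
Rewrite $\sum_{i=1}^n u_{ij}v_{ji}$ as $(u_j, v_j^*)$, where $u_j$ is the jth column of U and
$v_j$ is the jth row of V. By the equality case of the Cauchy–Schwarz
inequality, it follows that $u_j = c_j v_j^*$, for each $j \le r$, where $c_j$ is a
constant with $|c_j| = 1$. Thus, by switching $v_i^*$ and $v_i$, |tr A| equals
\[ |\operatorname{tr}(c_1\sigma_1 v_1^* v_1 + \cdots + c_r\sigma_r v_r^* v_r)| = |c_1\sigma_1 v_1 v_1^* + \cdots + c_r\sigma_r v_r v_r^*|. \]
Notice that $v_i v_i^* = 1$ for each i. By Problem 6 of Section 5.7, we
have $A = c_1 V^*DV$. The other direction is easy to verify. ∎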
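
Inequality (9.7) can be checked directly for 2 × 2 matrices without computing a singular value decomposition. The sketch below relies on two standard facts for a 2 × 2 matrix: $\sigma_1^2 + \sigma_2^2 = \sum |a_{ij}|^2$ and $\sigma_1\sigma_2 = |\det A|$, so that $(\sigma_1 + \sigma_2)^2 = \sum |a_{ij}|^2 + 2|\det A|$. The function name and the sample matrices are assumptions of this sketch.

```python
import math

def singular_value_sum_2x2(a):
    """sigma_1 + sigma_2 for a 2x2 complex matrix, via Frobenius norm and determinant."""
    frob2 = sum(abs(x) ** 2 for row in a for x in row)
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return math.sqrt(frob2 + 2 * abs(det))

def abs_trace(a):
    return abs(a[0][0] + a[1][1])

samples = {
    "uP with u = i, P = diag(2, 3)": [[2j, 0], [0, 3j]],  # equality case
    "rotation by 90 degrees": [[0, 1], [-1, 0]],          # normal, strict inequality
    "generic": [[1, 2], [3, 4]],
}

for name, a in samples.items():
    print(name, abs_trace(a), singular_value_sum_2x2(a))
```

The rotation matrix is normal yet gives |tr A| = 0 < 2, which illustrates the remark "but not conversely" in Theorem 9.6.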
There is a variety of Cauchy–Schwarz inequalities of different
types. The matrix version of the Cauchy–Schwarz inequality has
been obtained by different methods and techniques. Below we give
some Cauchy–Schwarz matrix inequalities involving the matrix
\[ |A| = (A^*A)^{1/2}, \quad \text{the modulus of } A. \]
Obviously, if A = UDV is a singular value decomposition of A,
where U and V are unitary and D is nonnegative diagonal, then
\[ |A| = V^*DV \quad\text{and}\quad |A^*| = UDU^*. \]
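Theorem 9.7 Let A be an n-square matrix. Then for any $u, v \in \mathbb{C}^n$,
\[ |(Au, v)|^2 \le (|A|u, u)(|A^*|v, v) \tag{9.8} \]
and
\[ |((A \circ A^*)u, u)| \le ((|A| \circ |A^*|)u, u). \tag{9.9} \]

Proof. It is sufficient, by Theorem 7.29, to observe that
\[ \begin{pmatrix} |A| & A^* \\ A & |A^*| \end{pmatrix}
 = \begin{pmatrix} V^* & 0 \\ 0 & U \end{pmatrix}
   \begin{pmatrix} D & D \\ D & D \end{pmatrix}
   \begin{pmatrix} V & 0 \\ 0 & U^* \end{pmatrix} \ge 0 \]
and, by Theorem 7.21,
\[ \begin{pmatrix} |A| \circ |A^*| & A^* \circ A \\ A \circ A^* & |A^*| \circ |A| \end{pmatrix}
 = \begin{pmatrix} |A| & A^* \\ A & |A^*| \end{pmatrix} \circ
   \begin{pmatrix} |A^*| & A \\ A^* & |A| \end{pmatrix} \ge 0. \]
Note that $|A| \circ |A^*| = |A^*| \circ |A|$, with u = v, gives (9.9). ∎

For more results, consider Hermitian matrices A decomposed as
\[ A = U^* \operatorname{diag}(\lambda_1, \dots, \lambda_n)U, \]
where U is unitary. Define $A^\alpha$ for $\alpha \in \mathbb{R}$, if each $\lambda_i^\alpha$ makes sense, as
\[ A^\alpha = U^* \operatorname{diag}(\lambda_1^\alpha, \dots, \lambda_n^\alpha)U. \]

Theorem 9.8 Let $A \in M_n$. Then for any $\alpha \in (0, 1)$,
\[ |(Au, v)| \le \bigl\| |A|^\alpha u \bigr\| \, \bigl\| |A^*|^{1-\alpha} v \bigr\|, \quad u, v \in \mathbb{C}^n. \tag{9.10} \]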
Proof. Let A = U|A| be a polar decomposition of A. Then
\[ A = U|A|^{1-\alpha}U^*U|A|^{\alpha} = |A^*|^{1-\alpha}U|A|^{\alpha}. \]
By the Cauchy–Schwarz inequality, we have
\[ |(Au, v)| = \bigl|(U|A|^{\alpha}u, |A^*|^{1-\alpha}v)\bigr| \le \bigl\| |A|^{\alpha}u \bigr\| \, \bigl\| |A^*|^{1-\alpha}v \bigr\|. \] ∎

Note that (9.8) is a special case of (9.10) by taking $\alpha = \frac12$.

Theorem 9.9 Let $A \in M_n$ and $\alpha \in \mathbb{R}$ be different from $\frac12$. If
\[ |(Au, u)| \le (|A|u, u)^{\alpha}(|A^*|u, u)^{1-\alpha}, \quad \text{for all } u \in \mathbb{C}^n, \tag{9.11} \]
then A is normal. The converse is obviously true.


Proof. We first consider the case where A is nonsingular.

Let A = UDV be a singular value decomposition of A, where D is
diagonal and invertible, and U and V are unitary. With $|A| = V^*DV$
and $|A^*| = UDU^*$, the inequality in (9.11) becomes
\[ |(UDVu, u)| \le (V^*DVu, u)^{\alpha}(UDU^*u, u)^{1-\alpha} \]
or
\[ |(D^{1/2}Vu, D^{1/2}U^*u)| \le (D^{1/2}Vu, D^{1/2}Vu)^{\alpha}(D^{1/2}U^*u, D^{1/2}U^*u)^{1-\alpha}. \]
For any nonzero $u \in \mathbb{C}^n$, set
\[ y = \frac{1}{\|D^{1/2}U^*u\|}\, D^{1/2}U^*u. \]
Then ||y|| = 1, and y ranges over all unit vectors as u runs over all
nonzero vectors. By putting $\tilde{A} = D^{1/2}VUD^{-1/2}$, we have
\[ |(\tilde{A}y, y)| \le (\tilde{A}y, \tilde{A}y)^{\alpha}, \quad \text{for all unit } y \in \mathbb{C}^n. \]
Applying Problem 19 of Section 6.1 to $\tilde{A}$, we see $\tilde{A}$ is unitary. Thus,
\[ D^{-1/2}U^*V^*DVUD^{-1/2} = I. \]
It follows that
\[ UDU^* = V^*DV. \]
By squaring both sides, we have $AA^* = A^*A$, or A is normal.

We next deal with the case where A is singular using mathematical
induction on n. If n = 1, we have nothing to prove. Suppose
that the assertion is true for (n − 1)-square matrices.

Noting that (9.11) still holds when A is replaced by $U^*AU$ for
any unitary matrix U, we assume, without loss of generality, that
\[ A = \begin{pmatrix} A_1 & b \\ 0 & 0 \end{pmatrix}, \]
where $A_1 \in M_{n-1}$ and b is an (n − 1)-column vector.
If b = 0, then A is normal by induction on $A_1$. If $b \neq 0$, we take
$u_1 \in \mathbb{C}^{n-1}$ such that $(b, u_1) \neq 0$. Let $u = \begin{pmatrix} u_1 \\ u_2 \end{pmatrix}$ with $u_2 > 0$. Then
\[ |(Au, u)| = |(A_1u_1, u_1) + (b, u_1)u_2| \]
and
\[ (|A^*|u, u) = \bigl((A_1A_1^* + bb^*)^{1/2}u_1, u_1\bigr), \]
which is independent of $u_2$. To compute (|A|u, u), we write
\[ |A| = \begin{pmatrix} C & d \\ d^* & \beta \end{pmatrix}. \]
Then
\[ b^*b = d^*d + \beta^2. \]
Hence, $\beta \neq 0$; otherwise d = 0 and thus b = 0. Therefore, $\beta > 0$ and
\[ (|A|u, u) = (Cu_1, u_1) + u_2\bigl((d, u_1) + (u_1, d)\bigr) + \beta u_2^2. \]
Letting $u_2 \to \infty$ in (9.11) implies $2\alpha \ge 1$ or $\alpha \ge \frac12$.
With A replaced by $A^*$ and $\alpha$ by $1 - \alpha$, we can rewrite (9.11) as
\[ |(A^*u, u)| \le (|A^*|u, u)^{1-\alpha}(|A|u, u)^{1-(1-\alpha)}. \]
Applying the same argument to $A^*$, one obtains $\alpha \le \frac12$. By the
induction hypothesis, we see that A is normal if $\alpha \neq \frac12$. ∎
Problems


1. For what value(s) of $x \in \mathbb{C}$ is the matrix A = (° al normal? Hermitian?
Diagonalizable? Show that $A^n$ is always normal if n is even.

2. Let A be a square complex matrix. Show that
\[ \operatorname{tr}|A| = \operatorname{tr}|A^*| \ge |\operatorname{tr} A| \quad\text{and}\quad \det|A| = \det|A^*| = |\det A|. \]

3. Give an example showing that the unitary matrix U in the
decomposition $A = U^* \operatorname{diag}(\lambda_1, \dots, \lambda_n)U$ is not unique. Show that the
definition $A^\alpha = U^* \operatorname{diag}(\lambda_1^\alpha, \dots, \lambda_n^\alpha)U$ for the Hermitian A and real
$\alpha$ is independent of choices of unitary matrices U.

4. Let $A \in M_n$, $H = \frac12(A + A^*)$, and $S = \frac12(A - A^*)$. Let $\lambda_t = a_t + b_t i$
be the eigenvalues of A, where $a_t$, $b_t$ are real, $t = 1, \dots, n$. Show that
\[ \sum_{t=1}^n |a_t|^2 \le \|H\|_2^2, \qquad \sum_{t=1}^n |b_t|^2 \le \|S\|_2^2. \]


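5. Show that for any n-square complex matrix A
\[ \begin{pmatrix} |A| & A^* \\ A & |A^*| \end{pmatrix} \ge 0, \quad\text{but}\quad
   \begin{pmatrix} |A| & A \\ A^* & |A^*| \end{pmatrix} \not\ge 0 \]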

in general. Conclude that 


Show that it is always true that 


A B A Be 
é 4) 20 (3 4 eo 


6. For any n-square complex matrices A and B, show that
\[ \begin{pmatrix} |A| + |B| & A^* + B^* \\ A + B & |A^*| + |B^*| \end{pmatrix} \ge 0. \]
Derive the determinant inequality
\[ (\det|A + B|)^2 \le \det(|A| + |B|)\,\det(|A^*| + |B^*|). \]
In particular,
\[ \det|A + A^*| \le \det(|A| + |A^*|). \]
Discuss the analogue for the Hadamard product.
7. Show that an n-square complex matrix A is normal if and only if
\[ |u^*Au| \le u^*|A|u \quad \text{for all } u \in \mathbb{C}^n. \]

8. Let A be an n-square complex matrix and $\alpha \in [0, 1]$. Show that
\[ \begin{pmatrix} |A|^{2\alpha} & A^* \\ A & |A^*|^{2(1-\alpha)} \end{pmatrix} \ge 0. \]

9. Let A and B be n-square complex matrices. Show that
\[ A^*A = B^*B \;\Leftrightarrow\; |A| = |B|. \]
Is it true that
\[ A^*A \ge B^*B \;\Leftrightarrow\; |A| \ge |B|? \]
Prove or disprove
\[ |A| \ge |B| \;\Leftrightarrow\; |A^*| \ge |B^*|. \]

10. Let [A] denote a principal submatrix of a square matrix A. Show by 
example that |[A]| and [|A|] are not comparable; that is, 


\[ |[A]| \not\ge [|A|] \quad\text{and}\quad [|A|] \not\ge |[A]|. \]


But the inequalities below hold, assuming the inverses involved exist: 


(a) [IA|?] 2 [AIl, 

(b) [|A|?] = I[A]l?, 

(c) (lA?) < [AI 
(d) [|AJ-¥/7] > [Al], 
(e) (lA[4] s [lA|-7]?. 


11. If A and B are normal matrices such that AB is normal, show
that BA is normal. Construct an example showing that matrices
A, B, AB, BA are all normal, but $AB \neq BA$.
9.4 Normal Matrix Perturbation


Given two square matrices, how “close” are the matrices in terms of 
their eigenvalues? More interestingly, if a matrix is “perturbed” a 
little bit, how would the eigenvalues of the matrix change? In this 
section we present three results on normal matrix perturbations. The 
first one is on comparison of |A| —|B| and A—B in terms of a norm, 
and the second one is on the difference (closeness) of the eigenvalues 
of the normal matrices in certain order. The third result is of the 
same kind except that only one matrix is required to be normal. 


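Theorem 9.10 (Kittaneh) If A, B are n × n normal matrices, then
\[ \bigl\| |A| - |B| \bigr\|_2 \le \|A - B\|_2. \]

Proof. By the spectral theorem, let $A = U^* \operatorname{diag}(\lambda_1, \dots, \lambda_n)U$ and
$B = V^* \operatorname{diag}(\mu_1, \dots, \mu_n)V$, where U and V are unitary matrices.
For simplicity, let $C = \operatorname{diag}(|\lambda_1|, \dots, |\lambda_n|)$, $D = \operatorname{diag}(|\mu_1|, \dots, |\mu_n|)$
and $W = (w_{ij}) = UV^*$. Then, upon computation, we have
\[
\begin{aligned}
\bigl\| |A| - |B| \bigr\|_2 &= \|U^*CU - V^*DV\|_2 \\
 &= \|CUV^* - UV^*D\|_2 \\
 &= \|CW - WD\|_2 \\
 &= \Bigl(\sum_{i,j=1}^n (|\lambda_i| - |\mu_j|)^2 |w_{ij}|^2\Bigr)^{1/2} \\
 &\le \Bigl(\sum_{i,j=1}^n |\lambda_i - \mu_j|^2 |w_{ij}|^2\Bigr)^{1/2} \\
 &= \Bigl(\sum_{i,j=1}^n |(\lambda_i - \mu_j)w_{ij}|^2\Bigr)^{1/2} \\
 &= \|\operatorname{diag}(\lambda_1, \dots, \lambda_n)W - W\operatorname{diag}(\mu_1, \dots, \mu_n)\|_2 \\
 &= \|A - B\|_2.
\end{aligned}
\] ∎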
Corollary 9.1 Let A and B be n × n complex matrices. Then
\[ \bigl\| |A| - |B| \bigr\|_2 \le \sqrt{2}\,\|A - B\|_2. \]

Proof. Let $\hat{A} = \begin{pmatrix} 0 & A \\ A^* & 0 \end{pmatrix}$ and $\hat{B} = \begin{pmatrix} 0 & B \\ B^* & 0 \end{pmatrix}$. Then $\hat{A}$ and $\hat{B}$ are
Hermitian (normal). Applying Theorem 9.10 to $\hat{A}$ and $\hat{B}$, we have
\[ \bigl\| |\hat{A}| - |\hat{B}| \bigr\|_2^2 = \bigl\| |A| - |B| \bigr\|_2^2 + \bigl\| |A^*| - |B^*| \bigr\|_2^2 \le 2\|A - B\|_2^2. \]
The desired inequality follows immediately. ∎


Theorem 9.11 (Hoffman–Wielandt) Let A and B be n × n normal
matrices having eigenvalues $\lambda_1, \dots, \lambda_n$ and $\mu_1, \dots, \mu_n$, respectively.
Then there exists a permutation p on {1, 2, ..., n} such that
\[ \Bigl(\sum_{i=1}^n |\lambda_i - \mu_{p(i)}|^2\Bigr)^{1/2} \le \|A - B\|_2. \]

Proof. Let $A = U^* \operatorname{diag}(\lambda_1, \dots, \lambda_n)U$ and $B = V^* \operatorname{diag}(\mu_1, \dots, \mu_n)V$
be spectral decompositions of A and B, respectively, where U and
V are unitary matrices. For simplicity, denote $E = \operatorname{diag}(\lambda_1, \dots, \lambda_n)$,
$F = \operatorname{diag}(\mu_1, \dots, \mu_n)$, and $W = (w_{ij}) = UV^*$. Then
\[
\begin{aligned}
\|A - B\|_2^2 &= \|U^*(EUV^* - UV^*F)V\|_2^2 \\
 &= \|EW - WF\|_2^2 \\
 &= \sum_{i,j=1}^n |\lambda_i - \mu_j|^2 |w_{ij}|^2.
\end{aligned}
\tag{9.12}
\]
Set $G = (|\lambda_i - \mu_j|^2)$ and $S = (|w_{ij}|^2)$. Then (9.12) is rewritten as
\[ \|A - B\|_2^2 = \sum_{i,j=1}^n |\lambda_i - \mu_j|^2 |w_{ij}|^2 = e^T(G \circ S)e, \]
where $G \circ S$ is the Hadamard (entrywise) product of G and S, and e is
the n-column vector all of whose components are 1. Note that S is a
doubly stochastic matrix. By the Birkhoff theorem (Theorem 5.21),
S is a convex combination of permutation matrices: $S = \sum_{i=1}^m t_iP_i$,
where all $t_i$ are nonnegative and add up to 1, $P_i$ are permutation
matrices. Among all $e^T(G \circ P_i)e$, $i = 1, \dots, m$, suppose $e^T(G \circ P_k)e$
is the smallest for some k. Consider this $P_k$ as a permutation p on
the set {1, 2, ..., n}. Then
\[
\begin{aligned}
\|A - B\|_2^2 = e^T(G \circ S)e &= \sum_{i=1}^m t_i\, e^T(G \circ P_i)e \\
 &\ge \sum_{i=1}^m t_i\, e^T(G \circ P_k)e \\
 &= e^T(G \circ P_k)e = \sum_{i=1}^n |\lambda_i - \mu_{p(i)}|^2.
\end{aligned}
\] ∎

The Hoffman–Wielandt theorem requires both matrices be normal.
In what follows we present a result in which one matrix is
normal and the other is arbitrary. For this, we need a lemma. For
convenience, if A is a square matrix, we write $A = U_A + D_A + L_A$,
where $U_A$, $D_A$, and $L_A$ are the upper part, diagonal, and lower part
of A, respectively. For instance, A = a) = C + (a) + oa) : 


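Lemma 9.1 Let A be an n × n normal matrix. Then
\[ \|U_A\|_2 \le \sqrt{n-1}\,\|L_A\|_2, \qquad \|L_A\|_2 \le \sqrt{n-1}\,\|U_A\|_2. \]

Proof. Upon computation, we have
\[
\begin{aligned}
\|U_A\|_2^2 &= \sum_{i=1}^{n-1}\sum_{j=i+1}^{n} |a_{ij}|^2 \\
 &\le \sum_{i=1}^{n-1}\sum_{j=i+1}^{n} (j - i)|a_{ij}|^2 \\
 &= \sum_{j=1}^{n-1}\sum_{i=j+1}^{n} (i - j)|a_{ij}|^2 \quad \text{(Problem 12)} \\
 &\le (n-1)\sum_{j=1}^{n-1}\sum_{i=j+1}^{n} |a_{ij}|^2 = (n-1)\|L_A\|_2^2.
\end{aligned}
\]
The second inequality follows by applying the argument to $A^T$. ∎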
Theorem 9.12 (Sun) Let A be an n × n normal matrix having
eigenvalues $\lambda_1, \dots, \lambda_n$ and B be any n × n matrix having eigenvalues
$\mu_1, \dots, \mu_n$. Then there is a permutation p on {1, 2, ..., n} such that
\[ \Bigl(\sum_{i=1}^n |\lambda_i - \mu_{p(i)}|^2\Bigr)^{1/2} \le \sqrt{n}\,\|A - B\|_2. \]

Proof. By the Schur triangularization theorem (Theorem 3.3), there
exists a unitary matrix U such that $U^*BU$ is upper-triangular.
Without loss of generality, we may assume that B is already
upper-triangular. Thus $D_B = \operatorname{diag}(\mu_1, \dots, \mu_n)$. Let C = A − B. Then
\[ A - D_B = C + U_B, \qquad U_B = U_A - U_C, \qquad L_A = L_C. \]
Because A and $D_B$ are both normal, the Hoffman–Wielandt theorem
ensures a permutation p on the index set {1, 2, ..., n} such that
\[ \Bigl(\sum_{i=1}^n |\lambda_i - \mu_{p(i)}|^2\Bigr)^{1/2} \le \|A - D_B\|_2 = \|C + U_B\|_2. \]
Now we apply the lemma to get
\[
\begin{aligned}
\|C + U_B\|_2^2 &= \|C + U_A - U_C\|_2^2 \\
 &= \|L_C + D_C + U_A\|_2^2 \\
 &= \|L_C\|_2^2 + \|D_C\|_2^2 + \|U_A\|_2^2 \\
 &\le \|L_C\|_2^2 + \|D_C\|_2^2 + (n-1)\|L_A\|_2^2 \\
 &= \|L_C\|_2^2 + \|D_C\|_2^2 + (n-1)\|L_C\|_2^2 \\
 &\le n\|C\|_2^2 = n\|A - B\|_2^2.
\end{aligned}
\]
Taking square roots of both sides yields the desired inequality. ∎


Problems 


1. If A is a normal matrix such that $A^2 = A$, show that A is Hermitian.
2. If A is a normal matrix such that $A^3 = A^2$, show that $A^2 = A$.

3. Let $A = (a_{ij})$ be a normal matrix. Show that $|\lambda| \ge \max_{i,j} |a_{ij}|$ for
at least one eigenvalue $\lambda$ of A.
4. Let $A = (a_{ij})$ be a normal matrix. If K is a k × k principal submatrix
of A, we denote by $\Sigma(K)$ the sum of all entries of K. Show that
$k|\lambda| \ge \max_K |\Sigma(K)|$ for at least one eigenvalue $\lambda$ of A.

5. Let $A = (a_{ij})$ be an n-square complex matrix and denote
\[ m_1 = \max_{i,j} |a_{ij}|, \quad m_2 = \max_{i,j} |a_{ij} + \overline{a_{ji}}|/2, \quad m_3 = \max_{i,j} |a_{ij} - \overline{a_{ji}}|/2. \]
Show that for any eigenvalue $\lambda = a + bi$ of A, where a and b are real,
\[ |\lambda| \le nm_1, \quad |a| \le nm_2, \quad |b| \le nm_3. \]

6. Let $A = (a_{ij}) \in M_n$ and denote $m = \max_{i,j} |a_{ij}|$. Show that
\[ |\det A| \le (m\sqrt{n})^n. \]

7. Let $A = (a_{ij})$ be an n × n real matrix and $d = \max_{i,j}\{|a_{ij} - a_{ji}|/2\}$.
Let $\lambda = a + bi$ be any eigenvalue of A, where a, b are real. Show that
\[ |b| \le d\sqrt{\frac{n(n-1)}{2}}. \]

8. Let $A = (a_{ij})$ be an n × n normal matrix having eigenvalues $\lambda_t =
a_t + b_t i$, where $a_t$ and $b_t$ are real, $t = 1, \dots, n$. Show that
\[ \max_t |a_t| \ge \max_{i,j} \Bigl|\tfrac12(a_{ij} + \overline{a_{ji}})\Bigr|, \qquad \max_t |b_t| \ge \max_{i,j} \Bigl|\tfrac12(a_{ij} - \overline{a_{ji}})\Bigr|. \]

9. Show that Theorem 9.12 is false if not both A and B are normal by
the example A= (?9) and B= (75'7').

10. Show that the scalar $\sqrt{2}$ in Corollary 9.1 is best possible by
considering a with A= (j¢) and B=()%) as a → 0.

11. Let A be a normal matrix partitioned as $\begin{pmatrix} E & F \\ G & H \end{pmatrix}$, where E and H are
square matrices (of possibly different sizes). Show that $\|F\|_2 = \|G\|_2$.

12. Use Problem 11 to show that for any n × n normal matrix $X = (x_{ij})$,
\[ \sum_{i=1}^{n-1}\sum_{j=i+1}^{n} (j - i)|x_{ij}|^2 = \sum_{j=1}^{n-1}\sum_{i=j+1}^{n} (i - j)|x_{ij}|^2. \]
13. Show that the scalar $\sqrt{n}$ in Theorem 9.12 is best possible by
considering the n × n matrices
\[
A = \begin{pmatrix}
0 & 1 & 0 & \cdots & 0 \\
0 & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 1 \\
1 & 0 & 0 & \cdots & 0
\end{pmatrix}, \qquad
B = \begin{pmatrix}
0 & 1 & 0 & \cdots & 0 \\
0 & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 1 \\
0 & 0 & 0 & \cdots & 0
\end{pmatrix}.
\]

14. Let A be a Hermitian matrix. If the diagonal entries of A are also the
eigenvalues of A, show that A has to be diagonal, i.e., all off-diagonal
entries are 0. Prove or disprove such a statement for normal matrices.

15. Let A and B be n × n normal matrices having eigenvalues $\lambda_1, \dots, \lambda_n$
and $\mu_1, \dots, \mu_n$, respectively. Show that
\[ \min_p \max_i |\lambda_i - \mu_{p(i)}| \le n\|A - B\|_\infty, \]
where p represents permutations on {1, 2, ..., n} and $\|A - B\|_\infty$ is
the spectral norm of A − B, i.e., the largest singular value of A − B.

16. Let $A \in M_n$ and let $\lambda_1, \dots, \lambda_n$ be the eigenvalues of A. The spread
of A, written s(A), is defined by $s(A) = \max_{i,j} |\lambda_i - \lambda_j|$. Show that
(a) $s(A) \le \sqrt{2}\,\|A\|_2$ for any $A \in M_n$.
(b) $s(A) \ge \sqrt{3}\max_{i \neq j} |a_{ij}|$ if A is normal.
(c) $s(A) \ge 2\max_{i \neq j} |a_{ij}|$ if A is Hermitian.

17. Let A, B, and C be n × n complex matrices. If all the eigenvalues of
B are contained in a disc $\{z \in \mathbb{C} : |z| < r\}$, and all the eigenvalues
of A lie outside the disc, i.e., in $\{z \in \mathbb{C} : |z| \ge r + d\}$ for some d > 0,
show that the matrix equation AX − XB = C has a solution
\[ X = \sum_{i=0}^{\infty} A^{-i-1}CB^i. \]
Show further that if A and B are normal, then $\|X\|_{op} \le \frac{1}{d}\|C\|_{op}$,
where $\|\cdot\|_{op}$ denotes the operator norm. [Hint: Use Theorem 4.4.]

CHAPTER 10 


Majorization and Matrix Inequalities 
Introduction: Majorization is an important tool in deriving matrix 
inequalities of eigenvalues, singular values, and matrix norms. In this 


chapter we introduce the concept of majorization, present its basic 
properties, and show a variety of matrix inequalities in majorization. 


10.1 Basic Properties of Majorization 


Two vectors in R” can be compared in different ways. For instance, 
one vector may be longer than the other one when measured in terms 
of norm (length); one may dominate the other componentwise. In 
this section, we introduce the concept of majorization, with which we 
may compare two real vectors and see whose components are “less 
spread out” or if one vector “contains” or “controls” the other. 


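Let $x = (x_1, x_2, \dots, x_n) \in \mathbb{R}^n$. We rearrange the components of x
in decreasing order and obtain a vector $x^\downarrow = (x_1^\downarrow, x_2^\downarrow, \dots, x_n^\downarrow)$, where
\[ x_1^\downarrow \ge x_2^\downarrow \ge \cdots \ge x_n^\downarrow. \]
Similarly, let $x_1^\uparrow \le x_2^\uparrow \le \cdots \le x_n^\uparrow$ denote the components of x in
increasing order and write $x^\uparrow = (x_1^\uparrow, x_2^\uparrow, \dots, x_n^\uparrow)$. For example,
\[ x = (-1, -2, 3), \quad x^\downarrow = (3, -1, -2), \quad x^\uparrow = (-2, -1, 3). \]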
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For @ = (2j, 852605 %y,) and 9 = (Ui, 01+; 04) mR, if 


k k 
Se RS ee 1, (10.1) 
1=1 i=1 


and 
nm nm 
=e (10.2) 
i=1 oak 


we say that y majorizes x or x is majorized by y, written as x < y or 
y > x. If the equality (10.2) is replaced with the inequality }>y"_, a; < 
eis We say that y weakly majorizes x or x is weakly majorized 
by y, denoted by x Xy y or y yw x. Obviously, x ~ y > © ~y y. As 
an example, take = (—1,0,1), y = (3,—2,—1), and z = (3,0,0). 
Then x < y and y <, z. Of course, © <y z. 
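The definitions (10.1) and (10.2) translate directly into a comparison of partial sums of the sorted components. The following is a minimal sketch of that check; the function names are illustrative and not part of the text, and the equality test on totals is exact, so it assumes integer or rational entries (floating-point data would need a tolerance).

```python
from itertools import accumulate

def partial_sums_desc(v):
    """Partial sums of the components of v taken in decreasing order."""
    return list(accumulate(sorted(v, reverse=True)))

def weakly_majorized(x, y):
    """True if x is weakly majorized by y: the k-th partial sums compare for k = 1..n."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same number of components")
    return all(a <= b for a, b in zip(partial_sums_desc(x), partial_sums_desc(y)))

def majorized(x, y):
    """True if x is majorized by y: (10.1) together with the equality (10.2)."""
    return weakly_majorized(x, y) and sum(x) == sum(y)

# The example from the text.
x, y, z = (-1, 0, 1), (3, -2, -1), (3, 0, 0)
print(majorized(x, y))         # x is majorized by y
print(weakly_majorized(y, z))  # y is weakly majorized by z
print(majorized(y, z))         # fails: the totals 0 and 3 differ
```

Sorting inside the helper reflects the remark that the positions of the components do not matter for majorization.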

Note that the positions of the components in the vectors are
unimportant for majorization; if a vector $x$ is majorized by $y$, then
any vector obtained by reordering the components of $x$ is also majorized by $y$.
The inequalities in (10.1) may be rewritten in the equivalent form:

$$\max_{1 \le i_1 < \cdots < i_k \le n} \sum_{t=1}^{k} x_{i_t} \le \max_{1 \le i_1 < \cdots < i_k \le n} \sum_{t=1}^{k} y_{i_t}, \quad k = 1, 2, \dots, n-1.$$

For the case of $n = 2$, intuitively, the set $\{x \in \mathbb{R}^2 : x \prec y\}$ for a
given $y \in \mathbb{R}^2$ is the line segment joining $y^{\downarrow}$ and $y^{\uparrow}$.


Figure 10.8: Majorization

For $x = (x_1, x_2, \dots, x_n)$, let $|x| = (|x_1|, |x_2|, \dots, |x_n|)$. For $x, y \in \mathbb{R}^n$,
$x + y$ and $x \circ y$ are, respectively, the componentwise sum and product
of $x$ and $y$. Apparently, $x \le y$ (componentwise) implies $x \prec_w y$. For
the sake of convenience, sometimes we simply write $x \in \mathbb{R}^n$ to mean
that $x = (x_1, x_2, \dots, x_n)$, where each $x_i \in \mathbb{R}$.

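It is easily checked from the definitions that majorization $\prec$ and
weak majorization $\prec_w$ are transitive binary relations on $\mathbb{R}^n$:

$$x \prec y,\ y \prec z \ \Rightarrow\ x \prec z; \qquad x \prec_w y,\ y \prec_w z \ \Rightarrow\ x \prec_w z.$$

Theorem 10.1 Let $x, y, z \in \mathbb{R}^n$. Then

1. $x \prec_w y \Rightarrow x_i \le y_1^{\downarrow}$ and $x \prec y \Rightarrow y_n^{\downarrow} \le x_i \le y_1^{\downarrow}$ for all $i$.
2. $x \prec z$, $y \prec z \Rightarrow px + qy \prec z$, where $p, q \ge 0$, $p + q = 1$.
3. $x \prec_w z$, $y \prec_w z \Rightarrow px + qy \prec_w z$, where $p, q \ge 0$, $p + q = 1$.
4. $x \prec y \Leftrightarrow x \prec_w y$ and $-x \prec_w -y$.
5. $x \prec y$, $y \prec x \Leftrightarrow x = yP$ for some permutation matrix $P$.
6. $x \prec_w y$, $y \prec_w x \Leftrightarrow x = yP$ for some permutation matrix $P$.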


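Proof. The first part of (1) is obvious from the definition by taking
$k = 1$. For the second part of (1), we show $y_n^{\downarrow} \le x_n^{\downarrow}$. $x \prec y$ reveals

$$\sum_{i=1}^{n-1} x_i^{\downarrow} \le \sum_{i=1}^{n-1} y_i^{\downarrow} \quad \text{and} \quad \sum_{i=1}^{n} x_i^{\downarrow} = \sum_{i=1}^{n} y_i^{\downarrow}.$$

Subtracting the inequality from the equality yields $x_n^{\downarrow} \ge y_n^{\downarrow}$.
(2) and (3) are similar. We show (3). Let $u = px + qy$. Then
$u \prec_w z$ follows because, for any $k = 1, 2, \dots, n$,

$$\sum_{i=1}^{k} u_i^{\downarrow} \le \sum_{i=1}^{k} (p x_i^{\downarrow} + q y_i^{\downarrow}) = p \sum_{i=1}^{k} x_i^{\downarrow} + q \sum_{i=1}^{k} y_i^{\downarrow} \le (p + q) \sum_{i=1}^{k} z_i^{\downarrow} = \sum_{i=1}^{k} z_i^{\downarrow}.$$

We now show (4). If $x \prec y$, then $x \prec_w y$. Moreover, $\sum_{i=1}^{n} x_i^{\downarrow} = \sum_{i=1}^{n} y_i^{\downarrow}$
and $\sum_{i=1}^{k} x_i^{\downarrow} \le \sum_{i=1}^{k} y_i^{\downarrow}$ for all $k < n$. Subtracting the
inequality from the equality yields $\sum_{i=k+1}^{n} x_i^{\downarrow} \ge \sum_{i=k+1}^{n} y_i^{\downarrow}$.
Thus, $\sum_{i=1}^{n-k} -(x_{n-i+1}^{\downarrow}) \le \sum_{i=1}^{n-k} -(y_{n-i+1}^{\downarrow})$. Noticing that

$$-(x_{n-i+1}^{\downarrow}) = (-x)_i^{\downarrow}, \qquad -(y_{n-i+1}^{\downarrow}) = (-y)_i^{\downarrow},$$

we have $\sum_{i=1}^{n-k} (-x)_i^{\downarrow} \le \sum_{i=1}^{n-k} (-y)_i^{\downarrow}$ for all $k = n-1, \dots, 2, 1$. With
$\sum_{i=1}^{n} (-x)_i = \sum_{i=1}^{n} (-y)_i$, we conclude $-x \prec_w -y$. For the converse,
it is sufficient to show that $\sum_{i=1}^{n} x_i = \sum_{i=1}^{n} y_i$. We have that $x \prec_w y$
implies $\sum_{i=1}^{n} x_i \le \sum_{i=1}^{n} y_i$. On the other hand, $-x \prec_w -y$ reveals
the reversed inequality. Thus, the equality has to hold. So $x \prec y$.
(5) and (6) are similar. We show (5). If $x \prec y$ and $y \prec x$, then
$x_1^{\downarrow} \le y_1^{\downarrow}$ and $y_1^{\downarrow} \le x_1^{\downarrow}$. Thus $x_1^{\downarrow} = y_1^{\downarrow}$. Let $\tilde{x}$ and $\tilde{y}$ be the vectors
obtained from $x$ and $y$ by deleting $x_1^{\downarrow}$ and $y_1^{\downarrow}$, respectively. Then
$\tilde{x} \prec \tilde{y}$ and $\tilde{y} \prec \tilde{x}$. From the above argument, we have $\tilde{x}_1^{\downarrow} = \tilde{y}_1^{\downarrow}$, i.e.,
$x_2^{\downarrow} = y_2^{\downarrow}$. Inductively, $x_i^{\downarrow} = y_i^{\downarrow}$ for all $i$. This says that the components
of $x$ are rearrangements of the components of $y$; that is, $x = yP$ for
some permutation matrix $P$. The converse is trivial. ∎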


Our next theorem best characterizes the relationship between the
weak majorization $\prec_w$ and the majorization $\prec$ via the componentwise
dominance $\le$. This theorem is used repeatedly.

Theorem 10.2 The following statements are equivalent.

1. $x \prec_w y$, where $x, y \in \mathbb{R}^n$.
2. $x \le z$ and $z \prec y$ for some $z \in \mathbb{R}^n$.
3. $x \prec u$ and $u \le y$ for some $u \in \mathbb{R}^n$.

Proof. (1)$\Leftrightarrow$(2): It is easy to see that (2)$\Rightarrow$(1). We show the converse
by induction on the number of components. If $n = 1$, it is obvious.
Let $n > 1$ and suppose it is true for vectors with fewer than $n$
components. We may assume that the components of $x$ and $y$ are
already in decreasing order. Let $\epsilon = \min_k \{\sum_{i=1}^{k} (y_i - x_i)\} \ge 0$ and
$\tilde{x} = x + (\epsilon, 0, \dots, 0)$. Then $\tilde{x} = \tilde{x}^{\downarrow}$, $x \le \tilde{x}$, and $\tilde{x} \prec_w y$, as for each $p$,

$$\sum_{i=1}^{p} \tilde{x}_i = \epsilon + \sum_{i=1}^{p} x_i \le \sum_{i=1}^{p} y_i.$$

If $\epsilon$ is attained when $k = m$, then $\sum_{i=1}^{m} \tilde{x}_i = \sum_{i=1}^{m} y_i$. This yields

$$(\tilde{x}_1, \dots, \tilde{x}_m) \prec (y_1, \dots, y_m), \qquad (\tilde{x}_{m+1}, \dots, \tilde{x}_n) \prec_w (y_{m+1}, \dots, y_n).$$

By induction, we have a real vector $(z_{m+1}, \dots, z_n)$ such that

$$(\tilde{x}_{m+1}, \dots, \tilde{x}_n) \le (z_{m+1}, \dots, z_n) \prec (y_{m+1}, \dots, y_n).$$

Set $z = (\tilde{x}_1, \dots, \tilde{x}_m, z_{m+1}, \dots, z_n)$. This $z$ serves the purpose.

(1)$\Leftrightarrow$(3): One easily checks that (3)$\Rightarrow$(1). We show the converse.
Let $y_k$ be the smallest component of $y$. Let $\delta = \sum_{i=1}^{n} y_i - \sum_{i=1}^{n} x_i$
and $u = y - \delta e_k$, where $e_k \in \mathbb{R}^n$ has component 1 in the $k$th position
and 0 elsewhere. Then it is easy to verify that $x \prec u$ and $u \le y$. ∎
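The construction used for (1) implies (3) is explicit: subtract the surplus $\delta$ from a smallest component of $y$. Below is a small sketch of that step; the helper names are illustrative assumptions, and exact integer arithmetic is used so that the equality of totals can be tested directly.

```python
from itertools import accumulate

def majorized(x, y):
    """True if x is majorized by y (exact arithmetic: integers or Fractions)."""
    sx = list(accumulate(sorted(x, reverse=True)))
    sy = list(accumulate(sorted(y, reverse=True)))
    return all(a <= b for a, b in zip(sx, sy)) and sx[-1] == sy[-1]

def shrink_to_majorant(x, y):
    """Given x weakly majorized by y, return u = y - delta * e_k as in the proof."""
    delta = sum(y) - sum(x)                     # nonnegative under weak majorization
    k = min(range(len(y)), key=lambda i: y[i])  # position of a smallest component of y
    u = list(y)
    u[k] -= delta
    return u

x, y = [1, 1, 0], [3, 1, 1]
u = shrink_to_majorant(x, y)
print(u)                                    # y with delta removed from a smallest entry
print(majorized(x, u))                      # x is majorized by u
print(all(a <= b for a, b in zip(u, y)))    # u is dominated by y componentwise
```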


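Theorem 10.3 Let $x, y \in \mathbb{R}^n$ and $u, v \in \mathbb{R}^m$. Then

1. $x \prec y$, $u \prec v \Rightarrow (x, u) \prec (y, v)$.
2. $x \prec_w y$, $u \prec_w v \Rightarrow (x, u) \prec_w (y, v)$.
3. $x \prec y$, $u \prec v \Rightarrow x + u \prec y^{\downarrow} + v^{\downarrow}$ (when $m = n$).
4. $x \prec_w y$, $u \prec_w v \Rightarrow x + u \prec_w y^{\downarrow} + v^{\downarrow}$ (when $m = n$).

Proof. (1) is similar to (2) and (3) is similar to (4). We show (2)
and (4). Let $\tilde{x} = (x, u)$ and $\tilde{y} = (y, v)$. For a positive integer $k \le m + n$,
suppose that the first $k$ largest components of $\tilde{x}$ consist of $x_1^{\downarrow}, \dots, x_r^{\downarrow}$
and $u_1^{\downarrow}, \dots, u_s^{\downarrow}$, $r + s = k$. Since $x \prec_w y$ and $u \prec_w v$, we have

$$\sum_{i=1}^{k} \tilde{x}_i^{\downarrow} = \sum_{i=1}^{r} x_i^{\downarrow} + \sum_{i=1}^{s} u_i^{\downarrow} \le \sum_{i=1}^{r} y_i^{\downarrow} + \sum_{i=1}^{s} v_i^{\downarrow} \le \sum_{i=1}^{k} \tilde{y}_i^{\downarrow}.$$

This says that $\tilde{x}$ is weakly majorized by $\tilde{y}$, i.e., $(x, u) \prec_w (y, v)$.
(4) is proven in a similar way by checking that

$$\sum_{i=1}^{k} (x + u)_i^{\downarrow} \le \sum_{i=1}^{k} x_i^{\downarrow} + \sum_{i=1}^{k} u_i^{\downarrow} \le \sum_{i=1}^{k} y_i^{\downarrow} + \sum_{i=1}^{k} v_i^{\downarrow} = \sum_{i=1}^{k} (y_i^{\downarrow} + v_i^{\downarrow}) = \sum_{i=1}^{k} (y^{\downarrow} + v^{\downarrow})_i^{\downarrow}.$$ ∎

Theorem 10.4 Let $x = (x_1, \dots, x_n)$, $y = (y_1, \dots, y_n) \in \mathbb{R}^n$. Then

1. $x \prec_w |x|$.
2. $|x + y| \prec_w |x|^{\downarrow} + |y|^{\downarrow}$.
3. $x^{\downarrow} + y^{\uparrow} \prec x + y \prec x^{\downarrow} + y^{\downarrow}$.
4. $\sum_{i=1}^{n} x_i^{\downarrow} y_i^{\uparrow} \le \sum_{i=1}^{n} x_i y_i \le \sum_{i=1}^{n} x_i^{\downarrow} y_i^{\downarrow}$.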


Proof. (1) is due to the fact that for each $k = 1, 2, \dots, n$

$$\sum_{i=1}^{k} x_i^{\downarrow} \le \sum_{i=1}^{k} |x_i^{\downarrow}| \le \sum_{i=1}^{k} |x|_i^{\downarrow}.$$

In a similar way, that $\sum_{i=1}^{k} |x + y|_i^{\downarrow} \le \sum_{i=1}^{k} (|x|_i^{\downarrow} + |y|_i^{\downarrow})$ implies (2).

The second inequality in (3) is a consequence of Theorem 10.3 (3).
To show the first one, we may assume that $x = x^{\downarrow}$. Consider the case
$n = 2$ first. If $y_1 \le y_2$, then $y = y^{\uparrow}$ and $x^{\downarrow} + y^{\uparrow} = x + y \prec x + y$.
If $y_1 > y_2$, then $y = y^{\downarrow}$ and $x + y = x^{\downarrow} + y^{\downarrow}$. As $y^{\uparrow} \prec y^{\downarrow}$, by
Theorem 10.3 (3), we have $x^{\downarrow} + y^{\uparrow} \prec x^{\downarrow} + y^{\downarrow} = x + y$.

Let $n > 2$. Our goal is to obtain $y^{\uparrow}$ by repeatedly exchanging
components of $y$ so that they are in increasing order. If $y = y^{\uparrow}$, there
is nothing to prove. Suppose $y_i > y_j$ for some $i$ and $j$, $i < j$. Switch
the components $y_i$ and $y_j$ in $y$ and denote the resulting vector by $\tilde{y}$.
Note that the pair of components $y_i$, $y_j$ in $y$ are now in increasing order
$y_j$, $y_i$ in $\tilde{y}$. Observe that $x + y$ and $x + \tilde{y}$ differ by two components.
By Theorem 10.3 (1) and the above argument for the case of $n = 2$,
we see $x + \tilde{y} \prec x + y$. If $\tilde{y} = y^{\uparrow}$, we are done. Otherwise, by the
same argument, we have $\hat{y}$ so that $x + \hat{y} \prec x + \tilde{y}$ and $\hat{y}$ has two
more components than $\tilde{y}$ that are in increasing order. Repeating the
process reveals $x^{\downarrow} + y^{\uparrow} \prec \cdots \prec x + \hat{y} \prec x + \tilde{y} \prec x + y$.

For (4), again, we assume $x = x^{\downarrow}$ and show the first inequality;
the proof for the second one is similar. If $y = y^{\uparrow}$, then we have
nothing to prove. Otherwise, let $\tilde{y}$ be the vector as above. Compute

$$x_i y_j + x_j y_i - x_i y_i - x_j y_j = (x_i - x_j)(y_j - y_i) \le 0;$$

that is,

$$x_i \tilde{y}_i + x_j \tilde{y}_j = x_i y_j + x_j y_i \le x_i y_i + x_j y_j.$$

This yields $\sum_{t=1}^{n} x_t \tilde{y}_t \le \sum_{t=1}^{n} x_t y_t$. Likewise, $\sum_{t=1}^{n} x_t \hat{y}_t \le \sum_{t=1}^{n} x_t \tilde{y}_t$.
Repeat this process until $y^{\uparrow}$ is obtained. Thus (4) follows. ∎
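Parts (3) and (4) of Theorem 10.4 are easy to try on a small example. The sketch below pairs the components of $x$ and $y$ in opposite order, in the given order, and in the same order, and then compares the resulting sums and inner products; the vectors and helper names are illustrative choices, not taken from the text.

```python
from itertools import accumulate

def majorized(x, y):
    """True if x is majorized by y (exact arithmetic: integers or Fractions)."""
    sx = list(accumulate(sorted(x, reverse=True)))
    sy = list(accumulate(sorted(y, reverse=True)))
    return all(a <= b for a, b in zip(sx, sy)) and sx[-1] == sy[-1]

def dot(u, v):
    """Ordinary inner product of two real vectors."""
    return sum(a * b for a, b in zip(u, v))

x = [2, -1, 5, 0]
y = [3, 4, -2, 1]
x_down = sorted(x, reverse=True)   # x in decreasing order
y_down = sorted(y, reverse=True)   # y in decreasing order
y_up = sorted(y)                   # y in increasing order

opposite = [a + b for a, b in zip(x_down, y_up)]
plain = [a + b for a, b in zip(x, y)]
similar = [a + b for a, b in zip(x_down, y_down)]

# Theorem 10.4 (3): the oppositely ordered sum is the least spread out.
print(majorized(opposite, plain), majorized(plain, similar))
# Theorem 10.4 (4): the three inner products appear in increasing order.
print(dot(x_down, y_up), dot(x, y), dot(x_down, y_down))
```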

The inequalities in Theorem 10.4 (4) can be generalized to majorization
inequalities when the components of $x$ and $y$ are all nonnegative;
in other words, the upper limit $n$ for the summation can
be replaced by $k = 1, 2, \dots, n$. (See Theorem 10.16.)

Denote by $\mathbb{R}^n_+$ the set of all vectors in $\mathbb{R}^n$ with nonnegative
components; that is, $u = (u_1, u_2, \dots, u_n) \in \mathbb{R}^n_+$ means that all $u_i \ge 0$.


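Theorem 10.5 Let $x = (x_1, \dots, x_n)$, $y = (y_1, \dots, y_n) \in \mathbb{R}^n$. Then

$$x \prec y \ \Leftrightarrow\ \sum_{i=1}^{n} x_i^{\downarrow} u_i^{\downarrow} \le \sum_{i=1}^{n} y_i^{\downarrow} u_i^{\downarrow} \ \text{ for all } u = (u_1, \dots, u_n) \in \mathbb{R}^n$$

and

$$x \prec_w y \ \Leftrightarrow\ \sum_{i=1}^{n} x_i^{\downarrow} u_i^{\downarrow} \le \sum_{i=1}^{n} y_i^{\downarrow} u_i^{\downarrow} \ \text{ for all } u = (u_1, \dots, u_n) \in \mathbb{R}^n_+.$$

Proof. We show the one for weak majorization. The other one is
similar. "$\Leftarrow$" is immediate by setting $u = (1, \dots, 1, 0, \dots, 0)$ in which
there are $k$ 1s, $k = 1, 2, \dots, n$. For "$\Rightarrow$", let $t_i = y_i^{\downarrow} - x_i^{\downarrow}$ for each $i$.
Then $x \prec_w y$ implies $\sum_{i=1}^{k} t_i \ge 0$ for $k = 1, 2, \dots, n$. Compute

$$\sum_{i=1}^{n} y_i^{\downarrow} u_i^{\downarrow} - \sum_{i=1}^{n} x_i^{\downarrow} u_i^{\downarrow} = \sum_{i=1}^{n} t_i u_i^{\downarrow}$$

$$= t_1 (u_1^{\downarrow} - u_2^{\downarrow}) + (t_1 + t_2)(u_2^{\downarrow} - u_3^{\downarrow}) + \cdots + (t_1 + \cdots + t_{n-1})(u_{n-1}^{\downarrow} - u_n^{\downarrow}) + (t_1 + \cdots + t_n) u_n^{\downarrow} \ge 0.$$

Therefore, the desired inequality follows. ∎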


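Theorem 10.6 Let $x, y, u, v \in \mathbb{R}^n_+$. Then

1. $x \prec_w y \Rightarrow x \circ u \prec_w y^{\downarrow} \circ u^{\downarrow}$.

2. $x \prec_w u$, $y \prec_w v \Rightarrow x \circ y \prec_w u^{\downarrow} \circ v^{\downarrow}$.

Proof. We first show that $(x_1^{\downarrow} u_1^{\downarrow}, \dots, x_n^{\downarrow} u_n^{\downarrow}) \prec_w (y_1^{\downarrow} u_1^{\downarrow}, \dots, y_n^{\downarrow} u_n^{\downarrow})$. By
setting $u_{k+1}^{\downarrow} = \cdots = u_n^{\downarrow} = 0$ in Theorem 10.5, we obtain

$$\sum_{i=1}^{k} x_i^{\downarrow} u_i^{\downarrow} \le \sum_{i=1}^{k} y_i^{\downarrow} u_i^{\downarrow}, \quad k = 1, \dots, n. \qquad (10.3)$$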


Note that all components of x, y, and u are nonnegative. For any 
positive integer k <n and sequence 1 < 71 <--- < ip, <n, we have 
ay, < ay and uw, < uy, t=1,...,k~ Tt follows that 


k k 

a 
y Li S y tie; k= le: 
t=1 t=1 


With (10.3), (1) is proven. For (2), apply (1) twice. ∎


Problems 


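1. Find two vectors $x, y \in \mathbb{R}^3$ such that neither $x \prec y$ nor $x \succ y$ holds.
2. Let $y = (2, 1) \in \mathbb{R}^2$. Sketch the following sets in the plane $\mathbb{R}^2$:

$$\{x \in \mathbb{R}^2 : x \prec y\} \quad \text{and} \quad \{x \in \mathbb{R}^2 : x \prec_w y\}.$$

3. Let $x, y \in \mathbb{R}^n$. Show that $(-x)^{\downarrow} = -(x^{\uparrow})$ and $x - y \prec x^{\downarrow} - y^{\uparrow}$.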


4. Let x = (2,...,2,) € R”. Show that 2} = es seeds 


ba AD eas ts 

5. Let x = (41, %9,...,%n) and y = (y1, y2,---, Yn) be in R”. Show that 
Big oe BL) eg, 0 Woe EH 1 oa 

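6. Let $x = (x_1, x_2, \dots, x_n) \in \mathbb{R}^n$. If $x_1 \ge x_2 \ge \cdots \ge x_n$, show that
$\frac{1}{m} \sum_{i=1}^{m} x_i \ge \frac{1}{n} \sum_{i=1}^{n} x_i$ for any positive integer $m \le n$.

7. Let $a_1, a_2, \dots, a_n$ be nonnegative numbers such that $\sum_{i=1}^{n} a_i = 1$.
Show that $(\frac{1}{n}, \frac{1}{n}, \dots, \frac{1}{n}) \prec (a_1, a_2, \dots, a_n) \prec (1, 0, \dots, 0)$ and that

$$v_n \prec (v_{n-1}, 0) \prec \cdots \prec (v_2, 0, \dots, 0) \prec (1, 0, \dots, 0),$$

where $v_k = (\frac{1}{k}, \dots, \frac{1}{k})$ with $k$ copies of $\frac{1}{k}$ for $k = 1, 2, \dots, n$.

8. Referring to Theorem 10.6 (2), can $u^{\downarrow} \circ v^{\downarrow}$ be replaced by $u \circ v$?

9. Let $x, y \in \mathbb{R}^n$. Show that $x \prec y \Leftrightarrow -x \prec -y$. If $x \prec_w y$, does it
necessarily follow that $-x \prec_w -y$ (or $-y \prec_w -x$)?

10. Let $x, y, z \in \mathbb{R}^n$. If $x + y \prec_w z$, show that $x \prec_w z^{\downarrow} - y^{\uparrow}$.

11. Let $x, y \in \mathbb{R}^n$ and $x \le y$ (componentwise). Show that $xP \le yP$ for
any $n \times n$ permutation matrix $P$ and consequently $x^{\downarrow} \le y^{\downarrow}$.

12. Let $e = (1, 1, \dots, 1) \in \mathbb{R}^n$. Find all $x \in \mathbb{R}^n$ such that $x \prec e$.

13. Let $x = (x_1, x_2, \dots, x_n) \in \mathbb{R}^n$ and $\bar{x} = \frac{1}{n}(x_1 + x_2 + \cdots + x_n)$. Show
that $\bar{x} e \prec x$, where $e = (1, 1, \dots, 1) \in \mathbb{R}^n$. State the case of $n = 2$.

14. Let $x, y \in \mathbb{R}^n_+$. Show that $(x, y) \prec_w (x + y, 0)$.

15. Let $x, y \in \mathbb{R}^n$. If $x \prec y$ and if $0 \le \alpha \le \beta \le 1$, show that

$$\beta x^{\downarrow} + (1 - \beta) y^{\downarrow} \prec \alpha x^{\downarrow} + (1 - \alpha) y^{\downarrow}.$$

16. Let $x, y \in \mathbb{R}^n$. Show that $x \prec y \Leftrightarrow (x, z) \prec (y, z)$ for all $z \in \mathbb{R}^m$.

17. Let $x, y \in \mathbb{R}^n$ and $z \in \mathbb{R}^m$. Show that

$$(x, z) \prec_w (y, z) \ \Rightarrow\ x \prec_w y.$$

Consider the more general case. If $(x, u) \prec (y, v)$ for some $u, v \in \mathbb{R}^m$
satisfying $u \prec v$, does it necessarily follow that $x \prec y$ or $x \prec_w y$?

18. Let $x, y \in \mathbb{R}^n$ such that $x \prec_w y$. Show that (i) there exists $\tilde{y} \in \mathbb{R}^n$
which differs from $y$ by at most one component such that $x \prec \tilde{y}$; and
(ii) there exist $a, b \in \mathbb{R}$ such that $(x, a) \prec (y, b)$.

19. Let $x, y, z \in \mathbb{R}^n_+$. If $2x \prec_w y^{\downarrow} + z^{\downarrow}$, show that

$$(x, x) \prec_w (y, z) \prec_w (y^{\downarrow} + z^{\downarrow}, 0).$$

20. Let $x = (x_1, x_2, \dots, x_n) \in \mathbb{R}^n$ and $a$ be a real number such that
$x_n^{\downarrow} \le a \le x_1^{\downarrow}$. Show that

$$\Big(a, \frac{x_1 + \cdots + x_n - a}{n - 1}, \dots, \frac{x_1 + \cdots + x_n - a}{n - 1}\Big) \prec x.$$

21. Let $x, y \in \mathbb{R}^n$. If $x \prec y$, show that $y_m^{\downarrow} \ge x_m^{\downarrow} \ge y_{m+1}^{\downarrow}$ for some $m$.

22. Let $t \in \mathbb{R}$ and denote $t^{+} = \max\{t, 0\}$. For $x, y \in \mathbb{R}^n$, if $x \prec_w y$, show
that $(x_1^{+}, \dots, x_n^{+}) \prec_w (y_1^{+}, \dots, y_n^{+})$. Is the converse true?

23. Give an example that Theorem 10.6 is not valid for some $x, y \in \mathbb{R}^n$.

24. Let $x = (x_1, \dots, x_n)$, $y = (y_1, \dots, y_n) \in \mathbb{R}^n$. If $x \prec y$, show that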


Youtul < outs <ouist < om yt, w= (u,.--,Un) € R”. 
10.2 Majorization and Stochastic Matrices

Recall that a doubly stochastic matrix is a square nonnegative matrix
whose row sums and column sums are all equal to 1. In symbols, $A$
is doubly stochastic if $A$ is nonnegative and for $e = (1, \dots, 1) \in \mathbb{R}^n$,

$$eA = e \quad \text{and} \quad Ae^T = e^T.$$

In this section, we show a close relation between majorization and
this type of matrices. Our goal is to present two fundamental results:
$x \prec y$ if and only if $x = yP$ for some doubly stochastic matrix $P$;
$x \prec_w y$ if and only if $x = yQ$ for some doubly substochastic matrix $Q$.
A doubly substochastic matrix is a square nonnegative matrix whose
row and column sums are each at most 1, i.e., $eA \le e$, $Ae^T \le e^T$.


Theorem 10.7 Let $A$ be an $n \times n$ nonnegative matrix. Then $A$ is
doubly stochastic if and only if $xA \prec x$ for all (row vectors) $x \in \mathbb{R}^n$.

Proof. For necessity, by Theorem 5.21, we write $A$ as a convex
combination of permutation matrices $P_1, P_2, \dots, P_m$:

$$A = \sum_{i=1}^{m} \alpha_i P_i, \qquad \sum_{i=1}^{m} \alpha_i = 1, \quad \alpha_i \ge 0.$$

Because $xP_i \prec x$ for each $i$, we have

$$xA = \sum_{i=1}^{m} \alpha_i x P_i \prec \sum_{i=1}^{m} \alpha_i x = x.$$

For sufficiency, take $x = e = (1, \dots, 1)$. Then $eA \prec e$ says that
each column sum of $A$ is at most 1. However, adding up all the
column sums of $A$, one should get $n$. It follows that every column
sum of $A$ is 1. Now set $x = e_i = (0, \dots, 1, \dots, 0)$, where 1 is the
$i$th component of $e_i$ and all other components are 0. Then $e_i A \prec e_i$
means that the $i$th row sum of $A$ is 1 for $i = 1, 2, \dots, n$. ∎
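The necessity argument can be followed numerically: build a doubly stochastic matrix as a convex combination of permutation matrices and check that $xA \prec x$. The sketch below uses exact rationals; the particular permutations, weights, and helper names are illustrative assumptions.

```python
from fractions import Fraction
from itertools import accumulate

def majorized(x, y):
    """True if x is majorized by y (exact arithmetic with Fractions)."""
    sx = list(accumulate(sorted(x, reverse=True)))
    sy = list(accumulate(sorted(y, reverse=True)))
    return all(a <= b for a, b in zip(sx, sy)) and sx[-1] == sy[-1]

def permutation_matrix(p):
    """Matrix with a 1 in position (i, p[i]) for each row i."""
    n = len(p)
    return [[1 if p[i] == j else 0 for j in range(n)] for i in range(n)]

def convex_combination(weights, matrices):
    """Entrywise weighted sum of square matrices of the same order."""
    n = len(matrices[0])
    return [[sum(w * M[i][j] for w, M in zip(weights, matrices))
             for j in range(n)] for i in range(n)]

def row_times_matrix(x, A):
    """Product of the row vector x with the matrix A."""
    n = len(x)
    return [sum(x[i] * A[i][j] for i in range(n)) for j in range(n)]

perms = [(0, 1, 2), (1, 2, 0), (2, 1, 0)]
weights = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]  # nonnegative, sum to 1
A = convex_combination(weights, [permutation_matrix(p) for p in perms])

x = [Fraction(4), Fraction(-1), Fraction(7)]
xA = row_times_matrix(x, A)
print(xA)
print(majorized(xA, x))   # xA is majorized by x, as Theorem 10.7 predicts
```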


We next take a closer look at the relation between two vectors
when one is majorized by the other. Specifically, we want to see
how to get one vector from the other. This can be accomplished by
successively applying a finite number of so-called T-transforms.
A $2 \times 2$ T-transformation, or transform for short, is a matrix of
the form $\begin{pmatrix} t & 1-t \\ 1-t & t \end{pmatrix}$, where $0 \le t \le 1$. For higher dimension, we call a
matrix a T-transform if it is obtained from the identity matrix $I$ by
replacing a $2 \times 2$ principal submatrix of $I$ with a $2 \times 2$ T-transform. It
is readily seen that a T-transform is a doubly stochastic matrix and
it can be written as $tI + (1-t)P$, where $0 \le t \le 1$ and $P$ is a permutation
matrix that just interchanges two columns of the identity matrix
$I$. Note that when $t = 0$, the T-transform is a permutation matrix.
Since a permutation on the set $\{1, 2, \dots, n\}$ can be obtained by a sequence
of interchanges (Problem 22, Section 5.6), every permutation
matrix can be factorized as a product of T-transforms.


Theorem 10.8 Let $x, y \in \mathbb{R}^n$. Then $x \prec y$ if and only if there exist
T-transforms $T_1, \dots, T_m$ such that $x = yT_1 \cdots T_m$. Consequently,

$$x \prec y \ \Leftrightarrow\ x = yD \text{ for some doubly stochastic matrix } D.$$

Proof. Sufficiency: This is immediate from the previous theorem.
To prove the necessity, we use induction on $n$. If $n = 1$, there is
nothing to show. Suppose $n > 1$ and the result is true for $n - 1$.

If $x_1 = y_1$, then $(x_2, \dots, x_n) \prec (y_2, \dots, y_n)$. By induction
hypothesis, there exist T-transforms $S_1, S_2, \dots, S_m$, all of order $n - 1$,
such that $(x_2, \dots, x_n) = (y_2, \dots, y_n) S_1 S_2 \cdots S_m$. Let $T_i = \begin{pmatrix} 1 & 0 \\ 0 & S_i \end{pmatrix}$.
Then $T_i$ is also a T-transform, $i = 1, 2, \dots, m$, and $x = yT_1 T_2 \cdots T_m$.

If $x_i = y_j$ for some $i$ and $j$, we may apply permutations on $x$ and
$y$ so that $x_i$ and $y_j$ are the first components of the resulting vectors.
Since every permutation matrix is a product of T-transforms, the
argument reduces to the case $x_1 = y_1$ that we have settled.

Let $x_i \ne y_j$ for all $i, j$. We may further assume that $x$ and $y$ are
in decreasing order. By Problem 21 of Section 10.1, $x \prec y$ implies

$$y_k > x_k > y_{k+1}, \quad \text{for some } k.$$

So

$$x_k = t y_k + (1 - t) y_{k+1}, \quad \text{for some } t \in (0, 1).$$

Let $T_0$ be the T-transform with $\begin{pmatrix} t & 1-t \\ 1-t & t \end{pmatrix}$ lying in the rows $k$ and
$k + 1$ and columns $k$ and $k + 1$. Set $z = yT_0$. Then $z$ and $y$ have the
same components except the $k$th and $(k+1)$th components:

$$z_k = t y_k + (1 - t) y_{k+1} = x_k$$

and

$$z_{k+1} = (1 - t) y_k + t y_{k+1} = y_k + y_{k+1} - x_k.$$

Note that

$$\sum_{i=1}^{k} z_i = \sum_{i=1}^{k-1} y_i + x_k \ge \sum_{i=1}^{k-1} x_i + x_k = \sum_{i=1}^{k} x_i.$$

For $r \ne k$, i.e., $r < k$ or $r > k$, bearing in mind that $z_k + z_{k+1} =
y_k + y_{k+1}$, we always have

$$\sum_{i=1}^{r} x_i \le \sum_{i=1}^{r} y_i = \sum_{i=1}^{r} z_i,$$

and equality holds when $r = n$. Hence, $x \prec z$.

Now $x$ and $z$ have the same component $x_k = z_k$. By the above
argument, we have that $x = zT_1 T_2 \cdots T_m = yT_0 T_1 T_2 \cdots T_m$, where
the $T_i$ are T-transforms. ∎
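The key step of the necessity proof is a single T-transform that moves $y$ toward $x$ while matching one component. The following sketch carries out that one step for vectors already in decreasing order with no common components; it is an illustration of the step under those assumptions, not a full factorization routine, and the function names are hypothetical. Indices are 0-based in the code, so `k = 0` corresponds to $k = 1$ in the text.

```python
from fractions import Fraction
from itertools import accumulate

def majorized(x, y):
    """True if x is majorized by y (exact arithmetic: integers or Fractions)."""
    sx = list(accumulate(sorted(x, reverse=True)))
    sy = list(accumulate(sorted(y, reverse=True)))
    return all(a <= b for a, b in zip(sx, sy)) and sx[-1] == sy[-1]

def t_transform_step(x, y):
    """One step of the proof: find k with y[k] > x[k] > y[k+1], solve for t, return z = y T_0.

    Assumes x and y are in decreasing order, x is majorized by y,
    and no component of x equals a component of y.
    """
    for k in range(len(x) - 1):
        if y[k] > x[k] > y[k + 1]:
            t = Fraction(x[k] - y[k + 1], y[k] - y[k + 1])  # x[k] = t*y[k] + (1-t)*y[k+1]
            z = list(y)
            z[k] = t * y[k] + (1 - t) * y[k + 1]            # equals x[k]
            z[k + 1] = (1 - t) * y[k] + t * y[k + 1]        # keeps z[k] + z[k+1] unchanged
            return k, t, z
    raise ValueError("no index k with y[k] > x[k] > y[k+1]")

x = [8, 7, 3]
y = [10, 6, 2]
k, t, z = t_transform_step(x, y)
print(k, t, z)
print(majorized(x, z), majorized(z, y))   # x is majorized by z, and z by y
```

After this step $x$ and $z$ share a component, so the remaining components can be handled by the induction hypothesis, as in the proof.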

Every doubly stochastic matrix is a convex combination of finitely many
permutation matrices (Theorem 5.21, Section 5.6); therefore we may
restate the second part of the theorem as $x \prec y$ if and only if there
exist permutation matrices $P_1, \dots, P_m$ such that $x = t_1 yP_1 + \cdots +
t_m yP_m$, where $t_1 + \cdots + t_m = 1$, and all $t_i \ge 0$. Thus, for given
$y \in \mathbb{R}^n$, $\{x : x \prec y\}$ is the convex hull of all points in $\mathbb{R}^n$ obtained
from $y$ by permuting its components.

We now study the analogue of Theorem 10.8 for weak majorization.
A weak majorization $\prec_w$ becomes the componentwise inequality
$\le$ via T-transforms. When the vectors are nonnegative, the weak
majorization can be characterized by doubly substochastic matrices.

Theorem 10.9 Let $x, y \in \mathbb{R}^n$. Then $x \prec_w y$ if and only if there
exist T-transforms $T_1, T_2, \dots, T_m$ such that $x \le yT_1 T_2 \cdots T_m$.

Proof. If $x \prec_w y$, by Theorem 10.2, there exists $z \in \mathbb{R}^n$ such that
$x \le z \prec y$. By Theorem 10.8, $z = yT_1 T_2 \cdots T_m$, where the $T_i$ are
T-transforms. It follows that $x \le yT_1 T_2 \cdots T_m$.

Conversely, let $u = yT_1 T_2 \cdots T_m$. Then $x \le u$, so $x \prec_w u$. However,
$u \prec y$ as $T_1 T_2 \cdots T_m$ is doubly stochastic. Thus, $x \prec_w y$. ∎


Theorem 10.10 Let $x, y \in \mathbb{R}^n_+$. Then $x \prec_w y$ if and only if $x = yS$
for some doubly substochastic matrix $S$.

Proof. Let $x = yS$ for some doubly substochastic matrix $S$. Then
there exists a doubly stochastic matrix $P$ such that $P - S$ is a nonnegative
matrix. Thus, $yS \le yP$ for $y \in \mathbb{R}^n_+$. As $yP \prec y$, $x = yS \le yP \prec y$,
so $x \prec_w y$.

For the converse, since $x \prec_w y$, by Theorem 10.9, there exist
T-transforms $T_1, T_2, \dots, T_m$ such that $x \le yT_1 T_2 \cdots T_m$. Denote
$z = yT_1 T_2 \cdots T_m$. Then $x \le z$. Note that $x$ is nonnegative. By
scaling the components of $z$ to get $x$; that is, taking $r_i$ so that
$x_i = z_i r_i$, $0 \le r_i \le 1$, $i = 1, 2, \dots, n$, we have a diagonal matrix
$R = \mathrm{diag}(r_1, r_2, \dots, r_n)$ such that $x = zR$. Set $S = T_1 T_2 \cdots T_m R$.
Then $S$ is doubly substochastic and $x = yS$. ∎
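One direction of Theorem 10.10 can be checked on a small example, together with the role of the nonnegativity assumption. The sketch below is only an illustration; the matrix, vectors, and helper names are assumptions made for the example.

```python
from fractions import Fraction
from itertools import accumulate

def weakly_majorized(x, y):
    """True if x is weakly majorized by y."""
    sx = accumulate(sorted(x, reverse=True))
    sy = accumulate(sorted(y, reverse=True))
    return all(a <= b for a, b in zip(sx, sy))

def is_doubly_substochastic(S):
    """Nonnegative square matrix whose row sums and column sums are at most 1."""
    n = len(S)
    nonneg = all(S[i][j] >= 0 for i in range(n) for j in range(n))
    rows_ok = all(sum(S[i]) <= 1 for i in range(n))
    cols_ok = all(sum(S[i][j] for i in range(n)) <= 1 for j in range(n))
    return nonneg and rows_ok and cols_ok

def row_times_matrix(y, S):
    """Product of the row vector y with the matrix S."""
    n = len(y)
    return [sum(y[i] * S[i][j] for i in range(n)) for j in range(n)]

h, q = Fraction(1, 2), Fraction(1, 4)
S = [[h, q, 0],
     [0, h, q],
     [q, 0, h]]

y = [6, 2, 1]                       # nonnegative vector
x = row_times_matrix(y, S)
print(is_doubly_substochastic(S), weakly_majorized(x, y))

y_neg = [-1, -1, -1]                # nonnegativity dropped
print(weakly_majorized(row_times_matrix(y_neg, S), y_neg))   # the conclusion fails here
```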


For given $y \in \mathbb{R}^n_+$, $\{x \in \mathbb{R}^n_+ : x \prec_w y\}$ is the convex hull of all points
$(t_1 y_{p(1)}, t_2 y_{p(2)}, \dots, t_n y_{p(n)})$, where $p$ runs over all permutations and
each $t_i$ is either 0 or 1. Note that both $x$ and $y$ are required to
be nonnegative vectors in Theorem 10.10. The conclusion does not
necessarily follow otherwise (Problem 14).


Problems 


1. Find a doubly stochastic matrix $P$ such that $(1, 2, 3) = (6, 0, 0)P$.

2. Show that $Q = (q_{ij})$ is a doubly substochastic matrix if there exists
a doubly stochastic matrix $D = (d_{ij})$ such that $q_{ij} \le d_{ij}$ for all $i, j$.

3. Let $A = (a_{ij})$ and $B = (b_{ij})$ be $n \times n$ matrices. Show that
(a) If $A$ and $B$ are doubly substochastic, then $C = (a_{ij} b_{ij})$ and
$D = (\sqrt{a_{ij} b_{ij}})$ are also doubly substochastic.

(b) If $A$ and $B$ are unitary, then $E = (|a_{ij} b_{ij}|)$ is doubly substochastic,
but $F = (\sqrt{|a_{ij} b_{ij}|})$ is not in general.

4. Show that the following matrix $A$ satisfies (i) $eA \ge e$, $Ae^T \ge e^T$; (ii)
$A - Q$ is never nonnegative for any doubly stochastic matrix $Q$:

$$A = \begin{pmatrix} 0 & 1 & 1 \\ 1 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix}, \qquad e = (1, 1, 1).$$

5. Show that the following doubly stochastic matrix cannot be expressed
as a product of T-transforms:

$$\begin{pmatrix} \frac{1}{2} & \frac{1}{2} & 0 \\ \frac{1}{2} & 0 & \frac{1}{2} \\ 0 & \frac{1}{2} & \frac{1}{2} \end{pmatrix}.$$

6. Let $P$ be a square matrix. If both $P$ and its inverse $P^{-1}$ are doubly
stochastic, show that $P$ is a permutation matrix.

7. Show each of the following statements.
(a) The (ordinary) product of two doubly stochastic matrices is a
doubly stochastic matrix.

(b) The (ordinary) product of two doubly substochastic matrices is
a doubly substochastic matrix.

(c) The Hadamard product of two doubly substochastic matrices
is a doubly substochastic matrix.

(d) The Kronecker product of two doubly substochastic matrices is
a doubly substochastic matrix.

(e) The convex combination of finitely many doubly stochastic matrices is
a doubly stochastic matrix.

(f) The convex combination of finitely many doubly substochastic matrices
is a doubly substochastic matrix.

8. Show that any square submatrix of a doubly stochastic matrix is
doubly substochastic and that every doubly substochastic matrix can
be regarded as a square submatrix of a doubly stochastic matrix.

9. A square (0,1)-matrix is called a subpermutation matrix if each row
and each column have at most one 1. Show that a matrix is doubly
substochastic if and only if it is a convex combination of finitely many
subpermutation matrices.

10. Let $A = (a_{ij})$ be an $n \times n$ doubly stochastic matrix. Show that $n$
entries of $A$ can be chosen from different rows and columns so that
their product is positive; that is, there exists a permutation $p$ such
that $\prod_{i=1}^{n} a_{ip(i)} > 0$. [Hint: Use the Frobenius-König theorem.]

11. A matrix $A$ of order $n \ge 2$ is said to be reducible if $PAP^T = \begin{pmatrix} B & 0 \\ C & D \end{pmatrix}$
for some permutation matrix $P$, where $B$ and $D$ are some square
matrices; $A$ is irreducible if it is not reducible. A matrix $A$ is said to
be decomposable if $PAQ = \begin{pmatrix} B & 0 \\ C & D \end{pmatrix}$ for some permutation matrices $P$
and $Q$, where $B$ and $D$ are some square matrices; $A$ is indecomposable
if it is not decomposable. Prove each of the following statements.

(a) If $A$ is a nonnegative indecomposable matrix of order $n$, then
the entries of $A^{n-1}$ are all positive.

(b) The product of two nonnegative indecomposable matrices is
indecomposable.

(c) The product of two nonnegative irreducible matrices need not
be irreducible.

12. Let $x, y \in \mathbb{R}^n_+$. Show that $x \prec_w y$ if and only if $x$ is a convex
combination of the vectors $yQ_1, yQ_2, \dots, yQ_m$, where $Q_1, Q_2, \dots, Q_m$
are subpermutation matrices.

13. Let $A$ be an $n \times n$ nonnegative matrix. Show that $A$ is doubly
substochastic if and only if $Ax \prec_w x$ for all column vectors $x \in \mathbb{R}^n_+$.

14. Give an example that $x \in \mathbb{R}^n_+$, $y \in \mathbb{R}^n$, $x = yS$ for some substochastic
matrix $S$, but $x \prec_w y$ does not hold.

15. Let $x, y \in \mathbb{R}^n$. Show that for any real numbers $a$ and $b$,

$$x \prec y \ \Rightarrow\ (ax_1 + b, ax_2 + b, \dots, ax_n + b) \prec (ay_1 + b, ay_2 + b, \dots, ay_n + b).$$

16.


a 


Let T = er. = (22) eo 1,0<6 <1. Show that 


5 r=1-(a+0d), 


for every positive integer n. Find T” as n — oo. 
10.3 Majorization and Convex Functions

This section is devoted to majorization and convex functions. Recall
that a real-valued function $f(t)$ defined on an interval of $\mathbb{R}$ is said to
be increasing if $x \le y$ implies $f(x) \le f(y)$, and convex if for all $x, y$
in the interval and all nonnegative numbers $\alpha, \beta$ such that $\alpha + \beta = 1$,

$$f(\alpha x + \beta y) \le \alpha f(x) + \beta f(y). \qquad (10.4)$$

A function is strictly convex if the above strict inequality holds whenever
$x \ne y$, $\alpha, \beta \in (0, 1)$, $\alpha + \beta = 1$. $f$ is concave if $-f$ is convex.

A general and equivalent form of (10.4) is known as Jensen's
inequality: let $f : I \to \mathbb{R}$ be a convex function on an interval $I \subseteq \mathbb{R}$.
Let $t_1, \dots, t_n$ be nonnegative numbers such that $\sum_{i=1}^{n} t_i = 1$. Then

$$f\Big(\sum_{i=1}^{n} t_i x_i\Big) \le \sum_{i=1}^{n} t_i f(x_i), \quad \text{whenever all } x_i \in I.$$

If $f(t)$ is a differentiable function, from calculus we know that $f(t)$
is increasing if the first derivative is nonnegative, i.e., $f'(t) \ge 0$ on the
interval, and $f(t)$ is convex if the second derivative is nonnegative,
i.e., $f''(t) \ge 0$. Geometrically, the graph of a convex function is
concave upwards. For example, $|t|$ is convex on $(-\infty, \infty)$. Also, if
$f(x)$ is twice differentiable and $f''(x) > 0$ then $f(x)$ is strictly convex.
For instance, $t^2$ and $e^t$ are strictly convex on $(-\infty, \infty)$, whereas $\ln t$
is strictly concave on $(0, \infty)$.

Below is a reversal inequality of Jensen type.

Theorem 10.11 Let $f : \mathbb{R}_+ \to \mathbb{R}$ be a strictly convex function with
$f(0) \le 0$. If $x_1, \dots, x_n$ are nonnegative numbers and at least two $x_i$
are nonzero, then

$$\sum_{i=1}^{n} f(x_i) < f\Big(\sum_{i=1}^{n} x_i\Big).$$

Proof. We prove the case $n = 2$. The general case is shown by
induction. Let $x_1, x_2 > 0$. Write $x_1 = \frac{x_1}{x_1 + x_2}(x_1 + x_2) + \frac{x_2}{x_1 + x_2} \cdot 0$. Then

$$f(x_1) < \frac{x_1}{x_1 + x_2} f(x_1 + x_2) + \frac{x_2}{x_1 + x_2} f(0).$$

In a similar way, we have

$$f(x_2) < \frac{x_1}{x_1 + x_2} f(0) + \frac{x_2}{x_1 + x_2} f(x_1 + x_2).$$

Adding both sides of the inequalities, we arrive at

$$f(x_1) + f(x_2) < f(x_1 + x_2) + f(0).$$

The desired inequality follows at once as $f(0) \le 0$. ∎


We may generalize the definitions of convex and concave functions
defined above to functions on $\mathbb{R}^n$ or on a convex subset of $\mathbb{R}^n$. For
instance, $f$ is convex on $\mathbb{R}^n$ if (10.4) holds for all $x, y \in \mathbb{R}^n$. A real-valued
function $\phi$ defined on $\mathbb{R}^n$ ($\mathbb{R}^n_+$ in most cases, or even more
generally, a convex set) is called Schur-convex if

$$x \prec y \ \Rightarrow\ \phi(x) \le \phi(y).$$

As an example, the function $\phi(x) = |x_1| + \cdots + |x_n|$ is Schur-convex
on $\mathbb{R}^n$, where $x = (x_1, \dots, x_n)$. Indeed, if $x \prec y$, we can write
$x = yA$, where $A = (a_{ij})$ is an $n \times n$ doubly stochastic matrix. Then

$$\phi(x) = \sum_{i=1}^{n} |x_i| = \sum_{i=1}^{n} \Big|\sum_{j=1}^{n} a_{ji} y_j\Big| \le \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ji} |y_j| = \sum_{j=1}^{n} \Big(\sum_{i=1}^{n} a_{ji}\Big) |y_j| = \sum_{j=1}^{n} |y_j| = \phi(y).$$

One may prove that $\phi(x) = |x_1|^2 + \cdots + |x_n|^2$ is Schur-convex on
$\mathbb{R}^n$ too. In fact, if $f(t)$ is convex on $\mathbb{R}$, then $\phi(x) = f(x_1) + \cdots + f(x_n)$
is Schur-convex on $\mathbb{R}^n$ (Problem 6).

The Schur-convex functions have been extensively studied. In
this book we are more focused on the functions defined on $\mathbb{R}$ that
preserve majorizations. When we write $f(x)$, where $x \in \mathbb{R}^n$, we
mean, conventionally, that $f$ is a function defined on an interval that
contains all components of $x$, and that $f$ is applied to every component
of $x$; that is, if $x = (x_1, \dots, x_n)$, then $f(x) = (f(x_1), \dots, f(x_n))$.
In such a way, $x^2 = (x_1^2, \dots, x_n^2)$ and $\ln x = (\ln x_1, \dots, \ln x_n)$. By
$x \in \mathbb{R}^n$, we automatically assume that the $i$th component of $x$ is $x_i$.
The following theorem is useful in deriving inequalities.


Theorem 10.12 Let $x, y \in \mathbb{R}^n$. If $f$ is convex, then

$$x \prec y \ \Rightarrow\ f(x) \prec_w f(y);$$

if $f$ is increasing and convex, then

$$x \prec_w y \ \Rightarrow\ f(x) \prec_w f(y).$$

Proof. Since $x \prec y$, by Theorem 10.8, there exists a doubly stochastic
matrix $A = (a_{ij})$ such that $x = yA$. This reveals

$$x_i = \sum_{j=1}^{n} a_{ji} y_j, \quad i = 1, 2, \dots, n.$$

Applying $f$ to both sides, and because $f$ is convex, we have

$$f(x_i) \le \sum_{j=1}^{n} a_{ji} f(y_j), \quad i = 1, 2, \dots, n.$$

Therefore,

$$(f(x_1), \dots, f(x_n)) \le (f(y_1), \dots, f(y_n)) A.$$

It follows that $f(x) \le f(y)A$. By Theorem 10.8, the first part of the
conclusion is immediate. For the second part, it suffices to note that
$x \prec_w y$ ensures $x \le z \prec y$ for some $z \in \mathbb{R}^n$ (see Theorem 10.2). ∎
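A quick numerical illustration of the first part of Theorem 10.12, using the convex functions $|t|$ and $t^2$ on an integer example so that all comparisons are exact. The helper names are illustrative; note that the conclusion is weak majorization only, since the totals of $f(x)$ and $f(y)$ need not agree.

```python
from itertools import accumulate

def weakly_majorized(x, y):
    """True if x is weakly majorized by y."""
    sx = accumulate(sorted(x, reverse=True))
    sy = accumulate(sorted(y, reverse=True))
    return all(a <= b for a, b in zip(sx, sy))

def majorized(x, y):
    """True if x is majorized by y (exact arithmetic)."""
    return weakly_majorized(x, y) and sum(x) == sum(y)

x, y = (1, 0, -1), (3, -1, -2)
print(majorized(x, y))                       # hypothesis of the theorem

abs_x, abs_y = [abs(t) for t in x], [abs(t) for t in y]
sq_x, sq_y = [t * t for t in x], [t * t for t in y]

print(weakly_majorized(abs_x, abs_y))        # f(t) = |t|
print(weakly_majorized(sq_x, sq_y))          # f(t) = t^2
print(majorized(abs_x, abs_y))               # only weak: the totals 2 and 6 differ
```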


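Corollary 10.1 Let $x, y \in \mathbb{R}^n$. Then

1. $x \prec y \Rightarrow |x| \prec_w |y|$, i.e., $(|x_1|, \dots, |x_n|) \prec_w (|y_1|, \dots, |y_n|)$.
2. $x \prec y \Rightarrow x^2 \prec_w y^2$, i.e., $(x_1^2, \dots, x_n^2) \prec_w (y_1^2, \dots, y_n^2)$.
3. $\ln x \prec_w \ln y \Rightarrow x \prec_w y$, where all $x_i$ and $y_i$ are positive.

Proof. For (1) and (2), it suffices to notice that $|t|$ and $t^2$ are convex,
whereas (3) is due to the fact that $e^t$ is increasing and convex. ∎

Majorization may be characterized in terms of convex functions.

Theorem 10.13 Let $x, y \in \mathbb{R}^n$. Then

1. $x \prec y \Leftrightarrow \sum_{i=1}^{n} f(x_i) \le \sum_{i=1}^{n} f(y_i)$ for all convex functions $f$.

2. $x \prec_w y \Leftrightarrow \sum_{i=1}^{n} f(x_i) \le \sum_{i=1}^{n} f(y_i)$ for all increasing and
convex functions $f$.

Proof. Necessities are immediate from Theorem 10.12. For sufficiencies,
we need to show that $\sum_{i=1}^{k} x_i^{\downarrow} \le \sum_{i=1}^{k} y_i^{\downarrow}$, $k = 1, \dots, n$. For
any fixed $k$, take $f(r) = (r - y_k^{\downarrow})^{+}$, where $t^{+} = \max\{t, 0\}$; that is,
$f(r) = r - y_k^{\downarrow}$ if $r \ge y_k^{\downarrow}$, 0 otherwise. Then $f(r)$ is an increasing convex function
in $r$ and the condition $\sum_{i=1}^{n} f(y_i) \ge \sum_{i=1}^{n} f(x_i)$ reveals

$$\sum_{i=1}^{k} (y_i^{\downarrow} - y_k^{\downarrow}) = \sum_{i=1}^{k} (y_i^{\downarrow} - y_k^{\downarrow})^{+} = \sum_{i=1}^{n} (y_i^{\downarrow} - y_k^{\downarrow})^{+}$$

$$\ge \sum_{i=1}^{n} (x_i^{\downarrow} - y_k^{\downarrow})^{+} \ge \sum_{i=1}^{k} (x_i^{\downarrow} - y_k^{\downarrow})^{+} \ge \sum_{i=1}^{k} (x_i^{\downarrow} - y_k^{\downarrow}).$$

This implies (2):

$$\sum_{i=1}^{k} x_i^{\downarrow} \le \sum_{i=1}^{k} y_i^{\downarrow}, \quad \text{i.e., } x \prec_w y.$$

For (1), take $g(x) = -x$. Then $g(x)$ is a convex function and this
gives $\sum_{i=1}^{n} x_i \ge \sum_{i=1}^{n} y_i$. So equality has to hold and $x \prec y$. ∎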


The next theorem is useful when an equality is in consideration.

Theorem 10.14 Let $x, y \in \mathbb{R}^n$. If $y$ is not a permutation of $x$,
then for any strictly increasing and strictly convex function $f$ defined on an interval that
contains all the components of $x$ and $y$,

$$x \prec_w y \ \Rightarrow\ \sum_{i=1}^{n} f(x_i) < \sum_{i=1}^{n} f(y_i).$$

Proof. Since $x \prec_w y$, by Theorem 10.2, there exists a $z \in \mathbb{R}^n$ such
that $x \prec z \le y$. Let $z = (z_1, \dots, z_n)$. By Theorem 10.13, we have

$$\sum_{i=1}^{n} f(x_i) \le \sum_{i=1}^{n} f(z_i) \le \sum_{i=1}^{n} f(y_i).$$

We show that in these inequalities at least one strict inequality holds.
If $z$ is a permutation of $x$, then $z$ cannot be a permutation of $y$.
Thus $z_k < y_k$ for some $k$. As $f$ is strictly increasing, $f(z_k) < f(y_k)$.
The second inequality is strict because $f(z_i) \le f(y_i)$ for all $i \ne k$.
If $z$ is not a permutation of $x$, there exists a nonpermutation,
doubly stochastic matrix $A = (a_{ij})$ such that $x = zA$. This reveals

$$x_i = \sum_{j=1}^{n} a_{ji} z_j, \quad i = 1, 2, \dots, n.$$

Applying $f$ to both sides, and since $f$ is strictly convex, we have

$$f(x_i) \le \sum_{j=1}^{n} a_{ji} f(z_j), \quad i = 1, 2, \dots, n,$$

and at least one strict inequality holds. It follows that

$$\sum_{i=1}^{n} f(x_i) < \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ji} f(z_j) = \sum_{j=1}^{n} f(z_j) \le \sum_{j=1}^{n} f(y_j).$$ ∎


i=1 j=1 j=1 i=1 


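Our next result is useful in deriving matrix inequalities and is
used repeatedly in later sections. For this purpose, we introduce
log-majorization. Let $x = (x_1, \dots, x_n)$ and $y = (y_1, \dots, y_n) \in \mathbb{R}^n_+$;
that is, all $x_i$ and $y_i$ are nonnegative. We say that $x$ is weakly
log-majorized by $y$ and write it as $x \prec_{wlog} y$ if

$$\prod_{i=1}^{k} x_i^{\downarrow} \le \prod_{i=1}^{k} y_i^{\downarrow}, \quad k = 1, \dots, n.$$

If equality holds when $k = n$, we say that $x$ is log-majorized by $y$ and
write it as $x \prec_{log} y$. In the event that all components of $x$ and $y$ are
positive, $x \prec_{wlog} y$ is the same as $\ln x \prec_w \ln y$, and $x \prec_{log} y$ if and
only if $\ln x \prec \ln y$, for $\ln t$ and $e^t$ are strictly increasing functions.

Theorem 10.15 Let $x, y \in \mathbb{R}^n_+$. Then

$$x \prec_{wlog} y \ \Rightarrow\ x \prec_w y;$$

that is,

$$\prod_{i=1}^{k} x_i^{\downarrow} \le \prod_{i=1}^{k} y_i^{\downarrow}, \ k = 1, \dots, n \ \Rightarrow\ \sum_{i=1}^{k} x_i^{\downarrow} \le \sum_{i=1}^{k} y_i^{\downarrow}, \ k = 1, \dots, n.$$

Proof. If all components of $x$ are positive, then all components of $y$
are positive. In this case, $x \prec_{wlog} y$ yields $\ln x \prec_w \ln y$ which results
in $x \prec_w y$ by Corollary 10.1 (3).

If $x$ contains zero components, say, $x_k^{\downarrow} = 0$ and $x_i^{\downarrow} > 0$ for all
$i < k$, then, by the above argument, $(x_1^{\downarrow}, \dots, x_{k-1}^{\downarrow}) \prec_w (y_1^{\downarrow}, \dots, y_{k-1}^{\downarrow})$.
So $x^{\downarrow} = (x_1^{\downarrow}, \dots, x_{k-1}^{\downarrow}, 0, \dots, 0) \prec_w (y_1^{\downarrow}, \dots, y_{k-1}^{\downarrow}, y_k^{\downarrow}, \dots, y_n^{\downarrow}) = y^{\downarrow}$. ∎

Note that the converse of Theorem 10.15 is not true. For example,
take $x = (3, 2, 1)$ and $y = (4, 1, 1)$. Then $x \prec y$, but $x \prec_{wlog} y$ does not hold.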
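The counterexample above, and an instance where Theorem 10.15 does apply, can be checked by comparing partial products and partial sums of the sorted components. The sketch below assumes nonnegative integer entries; the second vector `x2` and the function names are illustrative choices.

```python
from itertools import accumulate
from operator import mul

def weakly_majorized(x, y):
    """True if the partial sums of sorted x are bounded by those of sorted y."""
    sx = accumulate(sorted(x, reverse=True))
    sy = accumulate(sorted(y, reverse=True))
    return all(a <= b for a, b in zip(sx, sy))

def weakly_log_majorized(x, y):
    """True if the partial products of sorted x are bounded by those of sorted y.

    Intended for vectors with nonnegative components.
    """
    px = accumulate(sorted(x, reverse=True), mul)
    py = accumulate(sorted(y, reverse=True), mul)
    return all(a <= b for a, b in zip(px, py))

y = (4, 1, 1)

x = (3, 2, 1)    # the counterexample to the converse
print(weakly_majorized(x, y), weakly_log_majorized(x, y))

x2 = (2, 2, 1)   # weakly log-majorized by y, hence weakly majorized by y
print(weakly_log_majorized(x2, y), weakly_majorized(x2, y))
```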


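Theorem 10.16 Let $x, y \in \mathbb{R}^n_+$. Then

$$x^{\downarrow} \circ y^{\uparrow} \prec_w x \circ y \prec_w x^{\downarrow} \circ y^{\downarrow} \qquad (10.5)$$

and

$$\prod_{i=1}^{n} (x_i^{\downarrow} + y_i^{\uparrow}) \ge \prod_{i=1}^{n} (x_i + y_i) \ge \prod_{i=1}^{n} (x_i^{\downarrow} + y_i^{\downarrow}). \qquad (10.6)$$

Proof. We may only consider the case of positive $x$ and $y$; otherwise
we replace the zero components of $x$ or $y$ with arbitrarily small positive
numbers and use a continuity argument. Note that $(\ln x)^{\downarrow} =
\ln(x^{\downarrow})$. By Theorem 10.4 (3), $\ln x^{\downarrow} + \ln y^{\uparrow} \prec \ln x + \ln y \prec \ln x^{\downarrow} + \ln y^{\downarrow}$;
that is, $\ln(x^{\downarrow} \circ y^{\uparrow}) \prec \ln(x \circ y) \prec \ln(x^{\downarrow} \circ y^{\downarrow})$. Corollary 10.1 (3) reveals
(10.5). For (10.6), applying the convex function $f(t) = -\ln t$ to
$x^{\downarrow} + y^{\uparrow} \prec x + y \prec x^{\downarrow} + y^{\downarrow}$ (Theorem 10.4 (3)), Theorem 10.12 reveals

$$-\ln(x^{\downarrow} + y^{\uparrow}) \prec_w -\ln(x + y) \prec_w -\ln(x^{\downarrow} + y^{\downarrow}).$$

This yields

$$\ln(x_1^{\downarrow} + y_1^{\uparrow}) + \cdots + \ln(x_n^{\downarrow} + y_n^{\uparrow})$$
$$\ge \ln(x_1 + y_1) + \cdots + \ln(x_n + y_n)$$
$$\ge \ln(x_1^{\downarrow} + y_1^{\downarrow}) + \cdots + \ln(x_n^{\downarrow} + y_n^{\downarrow}).$$

The desired inequalities in (10.6) then follow immediately. ∎

We point out that if $x$ or $y$ contains negative components, then
the conclusion does not necessarily follow. For instance, take $x =
(-1, -2)$, $y = (1, 2)$. Then $x \circ y \prec_w x^{\downarrow} \circ y^{\downarrow}$ does not hold.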
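Both (10.5) and (10.6) can be tried on a small nonnegative example, and the instance with negative components can be checked to fail. The sketch below uses integers so that all comparisons are exact; the nonnegative test vectors and helper names are illustrative assumptions.

```python
from itertools import accumulate
from math import prod

def weakly_majorized(x, y):
    """True if x is weakly majorized by y."""
    sx = accumulate(sorted(x, reverse=True))
    sy = accumulate(sorted(y, reverse=True))
    return all(a <= b for a, b in zip(sx, sy))

def hadamard(u, v):
    """Componentwise product of two vectors."""
    return [a * b for a, b in zip(u, v)]

def add(u, v):
    """Componentwise sum of two vectors."""
    return [a + b for a, b in zip(u, v)]

x, y = [1, 3, 2], [2, 0, 5]
x_down, y_down, y_up = sorted(x, reverse=True), sorted(y, reverse=True), sorted(y)

# (10.5): weak majorization of the three componentwise products.
print(weakly_majorized(hadamard(x_down, y_up), hadamard(x, y)),
      weakly_majorized(hadamard(x, y), hadamard(x_down, y_down)))

# (10.6): the products of the componentwise sums appear in decreasing order.
print(prod(add(x_down, y_up)), prod(add(x, y)), prod(add(x_down, y_down)))

# With negative components the right-hand relation in (10.5) can fail.
xn, yn = [-1, -2], [1, 2]
print(weakly_majorized(hadamard(xn, yn),
                       hadamard(sorted(xn, reverse=True), sorted(yn, reverse=True))))
```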

We end this section with the result (Problem 13) for $x, y \in \mathbb{R}^n_+$:

$$x \prec y \ \Rightarrow\ \prod_{i=1}^{n} x_i \ge \prod_{i=1}^{n} y_i.$$


Problems 


1. Let x, y € R”. If x < y (componentwise), show that x <,, y. 


2. Let a > 1. Show that f(t) = t® is strictly increasing and strictly 
convex on R, and that g(t) = |t|* is strictly convex on R. 


3. Show that f(t) = e°’, a > 0, is strictly increasing and strictly convex 
on R and that g(t) = V‘ is strictly concave and increasing on R,. 

4. Show that f(t) = In(+ — 1) is convex on (0, 4) but not on (3,1). 

5. The following inequalities are of fundamental importance. They can
be shown in various ways. One way is to use induction; another way
is to use the Jensen inequality with convex functions ($f(x) = -\ln x$, say).

(a) Use the Jensen inequality to show the general arithmetic mean-geometric
mean inequality: if all $a_i > 0$, $p_i > 0$ and $\sum_{i=1}^n p_i = 1$, then

$$\prod_{i=1}^n a_i^{p_i} \le \sum_{i=1}^n p_i a_i.$$

(b) Use (a) to show the Hölder inequality: if $p, q > 1$ and $\frac{1}{p} + \frac{1}{q} = 1$, then

$$\Big|\sum_{i=1}^n a_i b_i\Big| \le \Big(\sum_{i=1}^n |a_i|^p\Big)^{1/p} \Big(\sum_{i=1}^n |b_i|^q\Big)^{1/q}$$

for complex numbers $a_1, \dots, a_n, b_1, \dots, b_n$.

(c) Use (b) to show the Minkowski inequality: if $1 \le p < \infty$, then

$$\Big(\sum_{i=1}^n |a_i + b_i|^p\Big)^{1/p} \le \Big(\sum_{i=1}^n |a_i|^p\Big)^{1/p} + \Big(\sum_{i=1}^n |b_i|^p\Big)^{1/p}$$

for complex numbers $a_1, \dots, a_n, b_1, \dots, b_n$.

6. Show that (i) if $f(t)$ is convex on $\mathbb{R}$, then $f_1(x) = f(x_1) + \cdots + f(x_n)$
and $f_2(x) = (f(x_1), \dots, f(x_n))$ are Schur-convex on $\mathbb{R}^n$; and (ii) if $g$
is convex on $\mathbb{R}^n$, then $g$ is Schur-convex on $\mathbb{R}^n$.


7. Let $f(t)$ be a nonnegative continuous function defined on an interval
$I \subseteq \mathbb{R}$. If $f$ is (strictly) convex on $I$, show that $F(x) = \prod_{i=1}^n f(x_i)$ is
(strictly) Schur-convex on $I^n = \{(x_1, \dots, x_n) : x_1, \dots, x_n \in I\} \subseteq \mathbb{R}^n$.

8. Give an example of a convex, nonincreasing function $f(t)$ for which
$(f(x_1), \dots, f(x_n)) \prec_w (f(y_1), \dots, f(y_n))$ is not true even if $x \prec_w y$.


9. Let $x, y \in \mathbb{R}^n$. Prove or disprove:

(a) $x \prec_w y \Rightarrow |x| \prec_w |y|$, i.e., $(|x_1|, \dots, |x_n|) \prec_w (|y_1|, \dots, |y_n|)$.

(6): lel dan Ly] Se =e, Te. (Fn 8) Sy 7 ong) 
(Che Of Shak” ig ue (i cn ee) See tte 

(dQ) a yi a a a? 1s, 8g cag?) Sy (UF nog th) 

(e) e<y= |z)? <w lyl?, ie, (z1[*,--+5 lanl?) <u (yal... [el®)- 
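(f) $x \prec_w y \Rightarrow e^x \prec_w e^y$, i.e., $(e^{x_1}, \dots, e^{x_n}) \prec_w (e^{y_1}, \dots, e^{y_n})$.

10. Let $x, y \in \mathbb{R}^n$. Show that $|x^{\downarrow} - y^{\downarrow}| \prec_w |x - y|$.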


11. Let 2, ye R%, « < y. Show that 7", rE 2 YS. Ju for each k. 
12. Show that the following functions are Schur-convex on $\mathbb{R}^n$.

(a) $f(x) = \max_i |x_i|$.
(b) $g(x) = \sum_{i=1}^n |x_i|^p$, $p \ge 1$.
(d) $p(x) = \sum_{i=1}^n \frac{1}{x_i}$, where all $x_i > 0$.

13. Let $x, y \in \mathbb{R}^n_+$. If $x \prec y$, show that $\prod_{i=1}^n x_i \ge \prod_{i=1}^n y_i$ and that the
strict inequality holds if $y$ is not a permutation of $x$. Show by example
that this is invalid if x or y contains nonnegative components.


14. Let v,y € R}. Show that the sum inequalities a a) < ae y, 
(k < n) imply the product inequalities i eS ee y! (k<n). 


15. Let $x, y, z$ be the three interior angles of any triangle. Show that

$$0 < \sin x + \sin y + \sin z \le \frac{3}{2}\sqrt{3}.$$

16. Let $a_1, \dots, a_n$ and $b_1, \dots, b_n$ be positive numbers. Show that

$$\Big(\frac{a_1^{\downarrow}}{b_1^{\downarrow}}, \dots, \frac{a_n^{\downarrow}}{b_n^{\downarrow}}\Big) \prec_w \Big(\frac{a_1}{b_1}, \dots, \frac{a_n}{b_n}\Big) \prec_w \Big(\frac{a_1^{\downarrow}}{b_1^{\uparrow}}, \dots, \frac{a_n^{\downarrow}}{b_n^{\uparrow}}\Big).$$

17. Let $x_1, \dots, x_n$ be positive numbers such that $x_1 + \cdots + x_n = 1$. Prove


Gee 00) a a 

(D) Seg Sel), gy = eae So 
) 
) 


ye l+e; > n(n+1) , n—-1> oe l-ti > n(n—1) | 


n—-1 ? 


(cd) 3g ty dn + < Inn. 


t=1 1l+a; — n+1 


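18. Let $A$ be an $n \times n$ positive semidefinite matrix, $n > 1$. Show that
$\operatorname{tr} e^{A} \le e^{\operatorname{tr} A} + (n - 1)$ with equality if and only if $\operatorname{rank}(A) \le 1$.

19. Let $x, y \in \mathbb{R}^n_+$. If $x \prec_{\log} y$ and $x \prec y$, show that $x^{\downarrow} = y^{\downarrow}$.

20. For real number $t$, denote $t^+ = \max\{t, 0\}$. Let $x, y \in \mathbb{R}^n$. Show that

(a) $x \prec y$ if and only if $\sum_{i=1}^n (x_i - t)^+ \le \sum_{i=1}^n (y_i - t)^+$ for all $t \in \mathbb{R}$
and $\sum_{i=1}^n x_i = \sum_{i=1}^n y_i$.
(b) $x \prec y$ if and only if $\sum_{i=1}^n |x_i - t| \le \sum_{i=1}^n |y_i - t|$ for all $t \in \mathbb{R}$.

21. Show that both words "strictly" in Theorem 10.14 are necessary.

22. Let $x, y \in \mathbb{R}^n_+$. If $x \prec_w y$, show that $x^m \prec_w y^m$ for all integers
$m \ge 1$, where $z^m = (z_1^m, \dots, z_n^m)$ for $z = (z_1, \dots, z_n) \in \mathbb{R}^n$. If
$x^m \prec y^m$ for all integers $m \ge 1$, show that $x^{\downarrow} = y^{\downarrow}$.

23. Let $g$ be a differentiable function on an interval $I \subseteq \mathbb{R}$. Show that
(a) $g$ is convex if and only if $g(\frac{a+b}{2}) \le \frac{1}{2}(g(a) + g(b))$ for all $a, b \in I$.
(b) $g$ is linear if and only if $g(\frac{a+b}{2}) = \frac{1}{2}(g(a) + g(b))$ for all $a, b \in I$.
(c) $g(x) \prec g(y)$ whenever $x \prec y$, $x, y \in \mathbb{R}^n$, if and only if $g$ is linear.

24. If $x, y, u, v \in \mathbb{R}^n_+$, show that $x \prec_{w\log} u$, $y \prec_{w\log} v$ $\Rightarrow$ $x \circ y \prec_{w\log} u^{\downarrow} \circ v^{\downarrow}$.

25. Let $a, b \in \mathbb{R}^n$ and all components of $a$ and $b$ be positive. Show that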


i=1 =] 
where r and s are the numbers such that r > . >s> P is bowers a 
[Hint: Use the Kantorovich inequality for A = diag(¢t,..., $).] 
10.4 Majorization of Diagonal Entries, Eigenvalues, and Singular Values


This section presents some elegant matrix inequalities involving di- 
agonal entries, eigenvalues, and singular values in terms of majoriza- 
tion. Especially, we show that the relationship between the diagonal 
entries and eigenvalues of a Hermitian matrix is precisely character- 
ized by majorization. It has been evident that majorization is a very 
useful tool in deriving matrix inequalities. 

We denote the vectors of the diagonal entries, eigenvalues, and 
singular values of an n-square complex matrix A, respectively, by 


$$\begin{aligned}
d(A) &= (d_1(A), d_2(A), \dots, d_n(A)), \\
\lambda(A) &= (\lambda_1(A), \lambda_2(A), \dots, \lambda_n(A)), \\
\sigma(A) &= (\sigma_1(A), \sigma_2(A), \dots, \sigma_n(A)).
\end{aligned}$$

The singular values are always arranged in decreasing order. In the
case where $A$ is Hermitian, all $d_i(A)$ and $\lambda_i(A)$ are real; we assume
that the components of $d(A)$ and $\lambda(A)$ are in decreasing order too.


Theorem 10.17 (Schur) Let $H$ be a Hermitian matrix. Then

$$d(H) \prec \lambda(H).$$

Proof. Let $H_k$, $1 \le k \le n$, be the $k \times k$ principal submatrix of $H$
with diagonal entries $d_1(H), d_2(H), \dots, d_k(H)$. Theorem 8.10 yields

$$\sum_{i=1}^k d_i(H) = \sum_{i=1}^k \lambda_i(H_k) \le \sum_{i=1}^k \lambda_i(H).$$

Equality holds when $k = n$, for both sides are equal to $\operatorname{tr} H$. ∎
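
Schur's majorization is easy to observe numerically. The snippet below is a small sketch, assuming NumPy; the random test matrix and the variable names are illustrative choices, not part of the text. It compares partial sums of the sorted diagonal with partial sums of the sorted eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(0)

# Build a random 5 x 5 Hermitian matrix.
M = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))
H = (M + M.conj().T) / 2

d = np.sort(np.real(np.diag(H)))[::-1]       # diagonal entries, decreasing
lam = np.sort(np.linalg.eigvalsh(H))[::-1]   # eigenvalues, decreasing

# d(H) ≺ λ(H): every partial sum of d is at most that of λ,
# and the full sums agree (both equal tr H).
partial_sums_ok = bool(np.all(np.cumsum(d) <= np.cumsum(lam) + 1e-10))
traces_agree = bool(np.isclose(d.sum(), lam.sum()))
```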


It is immediate that if H is a Hermitian matrix and U is any 
unitary matrix of the same size, then 


$$d(U^* H U) \prec \lambda(H). \tag{10.7}$$

The next result shows the validity of the converse of Theorem 10.17.
Thus the relationship between the diagonal entries and eigenvalues
of a Hermitian matrix is precisely characterized by majorization.

Theorem 10.18 Let $d = (d_1, d_2, \dots, d_n)$, $\lambda = (\lambda_1, \lambda_2, \dots, \lambda_n) \in \mathbb{R}^n$.
If $d \prec \lambda$, then there exists an $n \times n$ real symmetric matrix $H$
that has diagonal entries $d_1, d_2, \dots, d_n$ and eigenvalues $\lambda_1, \lambda_2, \dots, \lambda_n$.


Proof. The proof goes as follows. We first deal with the case of
$n = 2$, then proceed inductively. For $n > 2$, we construct a new
vector (sequence) $\lambda'$ that differs from $\lambda$ by at most two components,
one of which is some $d_i$; i.e., $\lambda'$ is a step "closer" to $d$. An application
of induction results in the desired conclusion. We may assume that
$d_1, d_2, \dots, d_n$ and $\lambda_1, \lambda_2, \dots, \lambda_n$ are arranged in decreasing order. Denote
$\operatorname{diag} d = \operatorname{diag}(d_1, d_2, \dots, d_n)$ and $\operatorname{diag} \lambda = \operatorname{diag}(\lambda_1, \lambda_2, \dots, \lambda_n)$.

If $n = 2$ and $\lambda_1 = \lambda_2$, then $d_1 = d_2$. The statement is obviously
true. If $n = 2$ and $\lambda_1 > \lambda_2$, because $(d_1, d_2) \prec (\lambda_1, \lambda_2)$, we have
$\lambda_2 \le d_2 \le d_1 \le \lambda_1$ and $d_1 + d_2 = \lambda_1 + \lambda_2$. Take

$$U = (\lambda_1 - \lambda_2)^{-1/2} \begin{pmatrix} (d_1 - \lambda_2)^{1/2} & -(\lambda_1 - d_1)^{1/2} \\ (\lambda_1 - d_1)^{1/2} & (d_1 - \lambda_2)^{1/2} \end{pmatrix}.$$

Then it is routine to check that $U$ is real orthogonal and that
$H = U^T (\operatorname{diag} \lambda)\, U = \begin{pmatrix} d_1 & * \\ * & d_2 \end{pmatrix}$ is real symmetric with diagonal entries $d_1, d_2$.

Now suppose $n > 2$. If $d_1 = \lambda_1$, then $(d_2, \dots, d_n) \prec (\lambda_2, \dots, \lambda_n)$.
By induction hypothesis on $n - 1$, there exists an $(n - 1)$-square real
symmetric matrix $G$ having diagonal entries $d_2, \dots, d_n$ and eigenvalues
$\lambda_2, \dots, \lambda_n$. Then take $H = (\lambda_1) \oplus G$, as desired.

If $d_1 < \lambda_1$, since the sum of all $d_i$'s equals that of all $\lambda_i$'s, it is
impossible that $d_i < \lambda_i$ for all $i$. Let $d_1 < \lambda_1, \dots, d_k < \lambda_k$, and
$d_{k+1} \ge \lambda_{k+1}$ for some $k \ge 1$. Then $\lambda_{k+1} \le d_{k+1} \le d_k < \lambda_k$. Put
$\lambda'_{k+1} = \lambda_{k+1} + \lambda_k - d_k$. It follows that $(d_k, \lambda'_{k+1}) \prec (\lambda_k, \lambda_{k+1})$. By the
above argument for $n = 2$, there is a $2 \times 2$ real orthogonal matrix $U_2$
such that the diagonal entries of $U_2^T \operatorname{diag}(\lambda_k, \lambda_{k+1})\, U_2$ are $d_k, \lambda'_{k+1}$.

Now replace the $2 \times 2$ principal submatrix in rows $k$ and $k + 1$
of the identity matrix $I_n$ by $U_2$ to get an $n \times n$ matrix $V$. Then
$V$ is real orthogonal. Set $F = V^T (\operatorname{diag} \lambda) V$. Then $F$ is real symmetric,
having eigenvalues $\lambda_1, \dots, \lambda_k, \lambda_{k+1}, \dots, \lambda_n$ and diagonal entries
$\lambda_1, \dots, d_k, \lambda'_{k+1}, \dots, \lambda_n$. Let $\lambda' = (\lambda_1, \dots, d_k, \lambda'_{k+1}, \dots, \lambda_n)$. Then
$d \prec \lambda' \prec \lambda$; $\lambda'$ and $\lambda$ differ by only two elements, and $d$ and $\lambda'$ both
contain $d_k$. Let $\tilde{d}$ be the vector obtained by deleting $d_k$ from $d$ and $\tilde{\lambda}$ by deleting
$d_k$ from $\lambda'$. Then $\tilde{d} \prec \tilde{\lambda}$. By induction, there exists an $(n - 1)$-square
real orthogonal matrix $U_{n-1}$ such that $U_{n-1}^T (\operatorname{diag} \tilde{\lambda})\, U_{n-1}$ has
diagonal entries $\tilde{d}$. Now construct an $n$-square matrix $W = (w_{ij})$ by
inserting a row and column in $U_{n-1}$ so that $w_{kk} = 1$ and all other
$w_{ij}$'s are 0 in the row and column. Then $W$ is real orthogonal and
$H = W^T F W$ has diagonal entries $d_1, d_2, \dots, d_n$, as desired. ∎
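
The $2 \times 2$ step of the proof can be verified directly. Below is a hedged sketch, assuming NumPy; the function name `two_by_two` is ours, and the sign pattern chosen for $U$ is one choice that satisfies the stated properties (other sign conventions work as well). It builds $U$ and $H = U^T \operatorname{diag}(\lambda_1, \lambda_2)\, U$ for $d = (3, 2) \prec \lambda = (4, 1)$.

```python
import numpy as np

def two_by_two(d1, lam1, lam2):
    """Orthogonal U with U^T diag(lam1, lam2) U having diagonal (d1, lam1 + lam2 - d1).

    Assumes lam1 > lam2 and lam2 <= d1 <= lam1 (illustrative helper, not from the text).
    """
    a = np.sqrt(d1 - lam2)
    b = np.sqrt(lam1 - d1)
    U = np.array([[a, -b], [b, a]]) / np.sqrt(lam1 - lam2)
    H = U.T @ np.diag([lam1, lam2]) @ U
    return U, H

U, H = two_by_two(3.0, 4.0, 1.0)   # d = (3, 2), lambda = (4, 1)

is_orthogonal = bool(np.allclose(U.T @ U, np.eye(2)))
diagonal_matches = bool(np.allclose(np.diag(H), [3.0, 2.0]))
eigenvalues_match = bool(np.allclose(np.sort(np.linalg.eigvalsh(H)), [1.0, 4.0]))
```

The general $n \times n$ construction applies this step repeatedly, embedding each $2 \times 2$ rotation in an identity matrix as described in the proof.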


Theorem 10.19 Let $A$ be an $n$-square complex matrix. Then

$$|d(A)| \prec_w \sigma(A) \tag{10.8}$$

and

$$|\lambda(A)| \prec_w \sigma(A). \tag{10.9}$$

Proof. We may assume that the absolute values of the diagonal
entries of $A$ are in decreasing order. For each $a_{ii}$, let $t_i$ be such that

$$t_i a_{ii} = |a_{ii}|, \quad |t_i| = 1, \quad i = 1, \dots, n.$$

Let $B = A \operatorname{diag}(t_1, \dots, t_n)$ and let $C$ be the leading $k \times k$ principal
submatrix of $B$. Then $B$ has the same singular values as $A$, and

$$d(C) = (|a_{11}|, \dots, |a_{kk}|).$$

By Theorem 8.14, we have $\sigma_i(C) \le \sigma_i(B)$, $i = 1, \dots, k$.
Applying Theorem 9.6 reveals (10.8):

$$\begin{aligned}
|\operatorname{tr} C| &= |a_{11}| + \cdots + |a_{kk}| \\
&\le \sigma_1(C) + \cdots + \sigma_k(C) \\
&\le \sigma_1(B) + \cdots + \sigma_k(B) \\
&= \sigma_1(A) + \cdots + \sigma_k(A).
\end{aligned}$$

For (10.9), let $A = U^* T U$ be a Schur decomposition of $A$, where
$U$ is unitary and $T$ is upper-triangular. Then

$$|\lambda(A)| = |d(T)| \quad\text{and}\quad \sigma(A) = \sigma(T).$$

An application of (10.8) gives (10.9). ∎
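
Both weak majorizations of Theorem 10.19 can be checked on a random matrix. This is an illustrative sketch, assuming NumPy; `weakly_majorized` is a helper we introduce here.

```python
import numpy as np

def weakly_majorized(x, y, tol=1e-9):
    """Return True if x ≺_w y (partial sums of sorted x bounded by those of y)."""
    xs = np.sort(np.asarray(x, dtype=float))[::-1]
    ys = np.sort(np.asarray(y, dtype=float))[::-1]
    return bool(np.all(np.cumsum(xs) <= np.cumsum(ys) + tol))

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 6)) + 1j * rng.standard_normal((6, 6))

abs_diag = np.abs(np.diag(A))                 # |d(A)|
abs_eig = np.abs(np.linalg.eigvals(A))        # |λ(A)|
sing = np.linalg.svd(A, compute_uv=False)     # σ(A)

diag_bound_ok = weakly_majorized(abs_diag, sing)   # (10.8)
eig_bound_ok = weakly_majorized(abs_eig, sing)     # (10.9)
```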
Theorem 10.20 Let $A$, $B$, $C$ be $n \times n$ complex matrices such that

$$\begin{pmatrix} A & B \\ B^* & C \end{pmatrix} \ge 0.$$

Then

$$\sigma(B) \prec_{w\log} \lambda^{1/2}(A) \circ \lambda^{1/2}(C) \tag{10.10}$$

and

$$|\lambda(B)| \prec_{w\log} \lambda^{1/2}(A) \circ \lambda^{1/2}(C). \tag{10.11}$$

Proof. For any $n \times p$ matrix $U$ and any $n \times q$ matrix $V$, we have

$$\begin{pmatrix} U^* & 0 \\ 0 & V^* \end{pmatrix} \begin{pmatrix} A & B \\ B^* & C \end{pmatrix} \begin{pmatrix} U & 0 \\ 0 & V \end{pmatrix} = \begin{pmatrix} U^*AU & U^*BV \\ V^*B^*U & V^*CV \end{pmatrix} \ge 0.$$

If $B = 0$, there is nothing to prove. Let $\operatorname{rank}(B) = r > 0$ and
choose $U$ and $V$ so that $B = UDV^*$ is a singular value decomposition
of $B$, where $D$ is the $r \times r$ diagonal matrix $\operatorname{diag}(\sigma_1(B), \dots, \sigma_r(B))$,
and $U$ and $V$ are $n \times r$ partial unitary matrices; that is, $U^*U = V^*V = I_r$.
Then $U^*BV = D$.

Denote by $[X]_k$ the $k \times k$ leading principal submatrix of matrix
$X$. Extracting such a submatrix from each block for every $k \le r$ gives

$$\begin{pmatrix} [U^*AU]_k & [D]_k \\ [D]_k & [V^*CV]_k \end{pmatrix} \ge 0.$$

Taking the determinant for each block and then for the $2 \times 2$ matrix,

$$\det [D]_k^2 \le \det([U^*AU]_k) \det([V^*CV]_k).$$

Or equivalently, for each $1 \le k \le r$,

$$\prod_{i=1}^k \sigma_i^2(B) \le \prod_{i=1}^k \lambda_i([U^*AU]_k)\, \lambda_i([V^*CV]_k).$$

By the eigenvalue interlacing theorem (Section 8.3), we arrive at

$$\prod_{i=1}^k \sigma_i^2(B) \le \prod_{i=1}^k \lambda_i(A)\, \lambda_i(C).$$

(10.10) follows by taking the square roots of both sides. (10.11) is
similarly obtained by letting $B = WTW^*$, where $W$ is unitary and
$T$ is an upper-triangular matrix with eigenvalues $\lambda_1(B), \lambda_2(B), \dots, \lambda_n(B)$
on the main diagonal (Schur decomposition). ∎


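Corollary 10.2 (Weyl) Let $A$ be any $n \times n$ complex matrix. Then

$$|\lambda(A)| \prec_{\log} \sigma(A). \tag{10.12}$$

Proof 1. Note that $|\lambda_1(A) \cdots \lambda_n(A)| = \sigma_1(A) \cdots \sigma_n(A) = |\det A|$.
To show the log-majorization, let $A = UDV$ be a singular value
decomposition of $A$, where $U, V$ are unitary, and $D$ is diagonal. Then

$$\begin{pmatrix} |A^*| & A \\ A^* & |A| \end{pmatrix} = \begin{pmatrix} U & 0 \\ 0 & V^* \end{pmatrix} \begin{pmatrix} D & D \\ D & D \end{pmatrix} \begin{pmatrix} U^* & 0 \\ 0 & V \end{pmatrix} \ge 0.$$

Applying (10.11) to the block matrix on the left gives the inequality.

Proof 2. Let $|\lambda_{\max}(X)|$ represent the largest modulus of the eigenvalues
of $X$. Then $|\lambda_{\max}(X)| \le \sigma_1(X)$. For any given positive integer
$k \le n$, consider the compound matrix $A^{(k)}$. For any $j_1 < \cdots < j_k$,

$$\prod_{i=1}^k |\lambda_{j_i}(A)| \le |\lambda_{\max}(A^{(k)})| \le \sigma_1(A^{(k)}) = \prod_{i=1}^k \sigma_i(A). \qquad ∎$$

As log-majorization implies weak majorization, we have

$$|\lambda(A)| \prec_w \sigma(A), \quad\text{i.e.,}\quad |\lambda(A)| \prec_w \lambda(|A|),$$

which is (10.9). Note that this is weaker than the log-majorization.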
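
Weyl's log-majorization compares partial products rather than partial sums. The following sketch, assuming NumPy and a randomly generated test matrix of our choosing, checks the partial products and the equality of the full products (both equal $|\det A|$).

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))

abs_eig = np.sort(np.abs(np.linalg.eigvals(A)))[::-1]   # |λ(A)|, decreasing
sing = np.linalg.svd(A, compute_uv=False)               # σ(A), decreasing

# Partial products of |λ| are bounded by those of σ (relative tolerance).
partial_products_ok = bool(np.all(np.cumprod(abs_eig) <= np.cumprod(sing) * (1 + 1e-9)))
# Full products coincide with |det A|.
full_product_equal = bool(np.isclose(np.prod(abs_eig), np.prod(sing)))
matches_det = bool(np.isclose(np.prod(sing), abs(np.linalg.det(A))))
```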
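Corollary 10.3 (Horn) Let $A, B \in M_n$. Then

$$\sigma(AB) \prec_{\log} \sigma(A) \circ \sigma(B).$$

Proof. This is immediate from (10.10) by observing that

$$\begin{pmatrix} AA^* & AB \\ B^*A^* & B^*B \end{pmatrix} = (A^*, B)^* (A^*, B) \ge 0. \tag{10.13}$$

Note that for any $n$-square matrix $X$, $\prod_{i=1}^n \sigma_i(X) = |\det X|$. ∎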




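Corollary 10.4 (Bhatia–Kittaneh) Let $A, B \in M_n$. Then

$$2\sigma_i(AB) \le \sigma_i(A^*A + BB^*), \quad i = 1, 2, \dots, n.$$

Proof. Use the block matrix in (10.13). By Problem 11, we have

$$\begin{aligned}
2\sigma_i(AB) &\le \lambda_i\big((A^*, B)^*(A^*, B)\big) \\
&= \lambda_i\big((A^*, B)(A^*, B)^*\big) \\
&= \lambda_i(A^*A + BB^*) = \sigma_i(A^*A + BB^*). \qquad ∎
\end{aligned}$$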
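
The Bhatia–Kittaneh inequality is an entrywise comparison of two sorted singular value vectors, so it is simple to test. A minimal sketch, assuming NumPy and random complex test matrices chosen for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

lhs = 2 * np.linalg.svd(A @ B, compute_uv=False)                          # 2 σ_i(AB)
rhs = np.linalg.svd(A.conj().T @ A + B @ B.conj().T, compute_uv=False)    # σ_i(A*A + BB*)

# Componentwise inequality for every index i.
inequality_ok = bool(np.all(lhs <= rhs + 1e-9))
```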


Problems 
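1. Let $A \in M_n$. Prove or disprove each of the following identities.

(a) $\sigma(A) = \sigma(A^*)$.
(b) $\sigma(|A|) = \sigma(|A^*|)$.
(c) $\lambda(A) = \lambda(A^*)$.
(d) $\lambda(|A|) = \lambda(|A^*|)$.
(e) $d(A) = d(A^*)$.
(f) $d(|A|) = d(|A^*|)$.
(g) $\sigma(A) = \lambda(|A|)$.
(h) $\sigma(A) = \sigma(|A|)$.

2. Let $A$ be a Hermitian matrix partitioned as $A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}$. Show
that $A_{11} \oplus A_{22} = \frac{1}{2}(A + UAU^*)$, where $U = I \oplus (-I)$, and that

$$\lambda(A_{11} \oplus A_{22}) = (\lambda(A_{11}), \lambda(A_{22})) \prec \lambda(A).$$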


3. For any square complex matrix A = (a;;), show that 


max |a;;| < |Amax(A)| < Oinax(A)s 


4. Show by example that $|d(A)| \prec_{w\log} \sigma(A)$ is not true in general.
5. Show that |d(A)| ~~ |A(A)| for all normal matrices A and that it is 


not true for nonnormal matrix B = ie _\o Ce) ie 1) = & 4) . 
6. Let $A = (a_{ij}) \in M_n$ and $p$ be a permutation on $\{1, 2, \dots, n\}$. Denote
$d_p(A) = (a_{1p(1)}, a_{2p(2)}, \dots, a_{np(n)})$. Show that $|d_p(A)| \prec_w \sigma(A)$.
0 Az 


cl * dog 


. Verify that Ao es , ) a= (2 : ) , where c = dy — \j. 


8. Let $A$ be a square complex matrix. Majorization (10.8) ensures that
$|d(A)| \prec_w \sigma(A)$. Show that this $\prec_w$ becomes $\prec$ if and only if $A = PU$
for a positive semidefinite matrix $P$ and a diagonal unitary matrix $U$.

9. Let $A$ be a square complex matrix. Majorization (10.9) ensures that
$|\lambda(A)| \prec_w \sigma(A)$. Show that this $\prec_w$ becomes $\prec$ if and only if $A$ is
normal; that is, $|\lambda(A)| \prec \lambda(|A|)$ if and only if $A$ is normal.


10. Let $A$ be an $n \times n$ positive definite matrix having diagonal entries
$d_1 \ge d_2 \ge \cdots \ge d_n$ and eigenvalues $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$. Show that


L eee ae k= 1,25 .0.97. 
k ik a yi 


i=k 


11. Let $A$, $B$, $C$ be $n \times n$ complex matrices such that $M = \begin{pmatrix} A & B \\ B^* & C \end{pmatrix} \ge 0$.
Show that M —2N > 0, where N = ( a a) > 0, and that 
20;(B) <\i(M), 1=1,2,...,n. 


12. Referring to Corollary 10.4, show by example that $A^*A$ cannot be
replaced with $AA^*$; that is, it is not true in general that $2\sigma_i(AB) \le \sigma_i(AA^* + BB^*)$,
even though $2\sigma_i(AB) \le \sigma_i(A^*A + BB^*)$ for all $i$.

13. Show that Corollary 10.3 implies Theorem 10.20 via Theorem 6.8.

14. Let $A \in M_n$. Show that


(a) |A(A)? <w 07(A). 

(b) |d(A)|? <wiog a(|Al) © A(|A*))- 

(c) d(|Al) 0 d(|A*|) <w A(A*A). 

(d) |A(A+ A*)| <u A(|A| + |A*]) and [A(A 0 A*)| <u A(|A] 0 | A"). 


15. Let $A, B, C \in M_n$ be such that $\begin{pmatrix} A & B \\ B^* & C \end{pmatrix} \ge 0$. Show that


[Alloy [lBllop ana ( 0A) CB) 
Cine le.) es a eo 


where $\|X\|_{\mathrm{op}}$ and $\rho(X)$ are the spectral norm and spectral radius of
square matrix $X$, respectively. Do these hold for $3 \times 3$ block matrices?


10.5 Majorization for Matrix Sum


Now we turn our attention to the eigenvalue and singular value
inequalities in majorization for the sum of Hermitian matrices.

Theorem 10.21 (Fan) Let $A, B \in M_n$ be Hermitian. Then

$$\lambda(A + B) \prec \lambda(A) + \lambda(B).$$

Proof. By Theorem 8.17, with $S_k$ denoting a set of any $k$ orthonormal
vectors $x_1, x_2, \dots, x_k \in \mathbb{C}^n$, we see that the weak majorization
$\lambda(A + B) \prec_w \lambda(A) + \lambda(B)$ is equivalent to, for each $k \le n$,

$$\max_{S_k} \sum_{i=1}^k x_i^*(A + B)x_i \le \max_{S_k} \sum_{i=1}^k x_i^* A x_i + \max_{S_k} \sum_{i=1}^k x_i^* B x_i.$$

Since $\operatorname{tr}(A + B) = \operatorname{tr} A + \operatorname{tr} B$, the desired majorization follows. ∎


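Theorem 10.22 Let $A, B \in M_n$ be Hermitian. Then

$$\lambda(A) - \lambda(B) \prec \lambda(A - B).$$

Proof. Write $A = B + (A - B)$. By Theorem 8.18,

$$\sum_{t=1}^k \lambda_{i_t}(A) \le \sum_{t=1}^k \lambda_{i_t}(B) + \sum_{j=1}^k \lambda_j(A - B),$$

which yields, for $k = 1, 2, \dots, n$,

$$\max_{1 \le i_1 < \cdots < i_k \le n} \sum_{t=1}^k \big(\lambda_{i_t}(A) - \lambda_{i_t}(B)\big) \le \sum_{j=1}^k \lambda_j(A - B);$$

that is,

$$\lambda(A) - \lambda(B) \prec_w \lambda(A - B).$$

As equality holds when $k = n$, the desired majorization follows. ∎

Note that $\lambda^{\uparrow}(H) = (\lambda_n(H), \dots, \lambda_1(H)) = -\lambda(-H)$ for any $n \times n$
Hermitian matrix $H$. We may rewrite the above two theorems as

$$\lambda(A) + \lambda^{\uparrow}(B) \prec \lambda(A + B) \prec \lambda(A) + \lambda(B); \tag{10.14}$$

$$\lambda(A) - \lambda(B) \prec \lambda(A - B) \prec \lambda(A) - \lambda^{\uparrow}(B). \tag{10.15}$$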
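
The four majorizations in (10.14) and (10.15) can be confirmed on random Hermitian matrices. The code below is a sketch, assuming NumPy; the helpers `majorized`, `random_hermitian`, and `eig_desc` are names we introduce for illustration.

```python
import numpy as np

def majorized(x, y, tol=1e-9):
    """Return True if x ≺ y: bounded partial sums and equal total sums."""
    xs = np.sort(np.asarray(x, dtype=float))[::-1]
    ys = np.sort(np.asarray(y, dtype=float))[::-1]
    partial = np.all(np.cumsum(xs) <= np.cumsum(ys) + tol)
    return bool(partial and abs(xs.sum() - ys.sum()) <= tol)

rng = np.random.default_rng(4)

def random_hermitian(n):
    M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return (M + M.conj().T) / 2

def eig_desc(H):
    """Eigenvalues of a Hermitian matrix in decreasing order."""
    return np.sort(np.linalg.eigvalsh(H))[::-1]

A, B = random_hermitian(6), random_hermitian(6)
a, b = eig_desc(A), eig_desc(B)
b_up = b[::-1]                      # λ↑(B): increasing order

checks = {
    "10.14 lower": majorized(a + b_up, eig_desc(A + B)),
    "10.14 upper": majorized(eig_desc(A + B), a + b),
    "10.15 lower": majorized(a - b, eig_desc(A - B)),
    "10.15 upper": majorized(eig_desc(A - B), a - b_up),
}
all_hold = all(checks.values())
```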


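For singular value majorizations, if $A$ is an $m \times n$ complex matrix,
as usual, we denote by $\sigma(A)$ the singular value vector of $A$; the
singular values of $A$ are the eigenvalues of $|A| = (A^*A)^{1/2}$. As we
know, for any matrix $X$, matrix $\widehat{X} = \begin{pmatrix} 0 & X \\ X^* & 0 \end{pmatrix}$ is Hermitian and has
eigenvalues $\sigma_1(X), \dots, \sigma_r(X), 0, \dots, 0, -\sigma_r(X), \dots, -\sigma_1(X)$, where $r$
is the rank of $X$. Applying Theorem 10.21 to $\widehat{A} = \begin{pmatrix} 0 & A \\ A^* & 0 \end{pmatrix}$ and
$\widehat{B} = \begin{pmatrix} 0 & B \\ B^* & 0 \end{pmatrix}$ gives the analogous majorization for singular values.

Theorem 10.23 Let $A$ and $B$ be $m \times n$ complex matrices. Then

$$\sigma(A + B) \prec_w \sigma(A) + \sigma(B). \tag{10.16}$$

Applying Theorem 10.22 to $\widehat{A}$ and $\widehat{B}$ reveals the following majorization
on the difference (in absolute value) of singular values and
the singular values of the difference of matrices.

Theorem 10.24 Let $A$ and $B$ be $m \times n$ complex matrices. Then

$$|\sigma(A) - \sigma(B)| \prec_w \sigma(A - B). \tag{10.17}$$

Proof. By Theorem 10.22, $\lambda(\widehat{A}) - \lambda(\widehat{B}) \prec \lambda(\widehat{A} - \widehat{B})$; that is,

$$(\sigma_1(A) - \sigma_1(B), \dots, 0, \dots, 0, \dots, \sigma_1(B) - \sigma_1(A)) \prec \lambda(\widehat{A} - \widehat{B}).$$

It follows that, for $k = 1, 2, \dots, n$,

$$\max_{1 \le i_1 < \cdots < i_k \le n} \sum_{t=1}^k |\sigma_{i_t}(A) - \sigma_{i_t}(B)| \le \sum_{j=1}^k \sigma_j(A - B). \qquad ∎$$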
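
A quick numerical check of (10.16) and (10.17) for rectangular matrices follows. It is a sketch, assuming NumPy; the $5 \times 3$ test matrices and the helper name are illustrative.

```python
import numpy as np

def weakly_majorized(x, y, tol=1e-9):
    """Return True if x ≺_w y."""
    xs = np.sort(np.asarray(x, dtype=float))[::-1]
    ys = np.sort(np.asarray(y, dtype=float))[::-1]
    return bool(np.all(np.cumsum(xs) <= np.cumsum(ys) + tol))

rng = np.random.default_rng(5)
A = rng.standard_normal((5, 3)) + 1j * rng.standard_normal((5, 3))
B = rng.standard_normal((5, 3)) + 1j * rng.standard_normal((5, 3))

sA = np.linalg.svd(A, compute_uv=False)   # three singular values, decreasing
sB = np.linalg.svd(B, compute_uv=False)

sum_bound_ok = weakly_majorized(np.linalg.svd(A + B, compute_uv=False), sA + sB)          # (10.16)
diff_bound_ok = weakly_majorized(np.abs(sA - sB), np.linalg.svd(A - B, compute_uv=False))  # (10.17)
```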


Theorem 10.25 Let $A$ and $B$ be $n$-square positive semidefinite matrices
and let $z$ be any complex number. Then

$$\sigma(A - |z|B) \prec_{w\log} \sigma(A + zB) \prec_{w\log} \sigma(A + |z|B). \tag{10.18}$$

Proof. For the second part, by (10.10), it is sufficient to notice that

$$\begin{pmatrix} A + |z|B & A + zB \\ A + \bar{z}B & A + |z|B \end{pmatrix} \ge 0.$$

Now we show the first majorization in (10.18). If $A$ and $B$ are
nonnegative diagonal matrices, invoking the elementary inequality
$|a - |z|b| \le |a + zb|$, where $z \in \mathbb{C}$, $a, b \ge 0$, we arrive at

$$|\det(A - |z|B)| \le |\det(A + zB)|. \tag{10.19}$$

Inequality (10.19) actually holds for all positive semidefinite $A$
and $B$ due to the fact that there exists an invertible matrix $P$ such
that $P^*AP$ and $P^*BP$ are both nonnegative diagonal (Theorem 7.6).

Note that $A - |z|B$ is Hermitian. By (10.19) and (10.12), we have

$$\prod_{i=1}^n \sigma_i(A - |z|B) \le \prod_{i=1}^n |\lambda_i(A + zB)| \le \prod_{i=1}^n \sigma_i(A + zB). \tag{10.20}$$

We claim that the upper limit $n$ for the products in (10.20) can
be replaced by any positive integer $k \le n$. To show this, let $U$ be an
$n$-square unitary matrix such that $U^*(A - |z|B)U = \operatorname{diag}(\lambda_1, \dots, \lambda_n)$,
where $|\lambda_i| = \sigma_i(A - |z|B)$. Write $U = (U_1, U_2)$, where $U_1$ consists of
the first $k$ columns of $U$. Then $U_1^*AU_1$ and $U_1^*BU_1$ are $k \times k$. Thus,

$$\begin{aligned}
\prod_{i=1}^k \sigma_i(A - |z|B) &= \prod_{i=1}^k \sigma_i\big(U_1^*(A - |z|B)U_1\big) \\
&= \prod_{i=1}^k \sigma_i\big(U_1^*AU_1 - |z|\,U_1^*BU_1\big) \\
&\le \prod_{i=1}^k \sigma_i\big(U_1^*AU_1 + z\,U_1^*BU_1\big) \quad \text{(by (10.20))} \\
&= \prod_{i=1}^k \sigma_i\big(U_1^*(A + zB)U_1\big) \\
&\le \prod_{i=1}^k \sigma_i(A + zB).
\end{aligned}$$

The last inequality is by Problem 11 of Section 8.3. ∎

The following result is immediate since "$\prec_{w\log}$" implies "$\prec_w$".


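Corollary 10.5 Let $A$ and $B$ be $n$-square positive semidefinite matrices
and let $z$ be any complex number with $|z| = 1$. Then

$$\sigma(A - B) \prec_w \sigma(A + zB) \prec_w \sigma(A + B).$$

Theorem 10.26 Let $A$ be an $n \times n$ complex matrix and let $H(A)$ be
the Hermitian part of $A$; that is, $H(A) = (A + A^*)/2$. Then for any
$n \times n$ Hermitian matrix $G$,

$$\sigma(A - H(A)) \prec_w \sigma(A - G). \tag{10.21}$$

Proof. By Theorem 10.23,

$$\begin{aligned}
\sigma(A - H(A)) &= \sigma\big((A - G)/2 - (A - G)^*/2\big) \\
&\prec_w \sigma\big((A - G)/2\big) + \sigma\big((A - G)^*/2\big) \\
&= \sigma(A - G). \qquad ∎
\end{aligned}$$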


Theorem 10.27 Let $A$ be an $n \times n$ positive semidefinite matrix.
Then for any $n \times n$ unitary matrix $U$,

$$\sigma(A - I_n) \prec_w \sigma(A - U) \prec_w \sigma(A + I_n). \tag{10.22}$$

Proof. Because (10.22) holds if and only if it holds when $A$ is replaced
with $W^*AW$, where $W$ is unitary, we may assume that $A$ is
a diagonal matrix with nonnegative diagonal entries. An application
of Theorem 10.24 results in

$$\begin{aligned}
\sigma(A - I_n) = \big(|\sigma(A) - \sigma(U)|\big)^{\downarrow} &\prec_w \sigma(A - U) \\
&\prec_w \sigma(A) + \sigma(U) = \sigma(A + I_n). \qquad ∎
\end{aligned}$$

Corollary 10.6 Let $A$ be an $n \times n$ complex matrix and $A = UP$
be a polar decomposition of $A$, where $U$ is unitary and $P$ is positive
semidefinite. Then for any $n \times n$ unitary matrix $V$,

$$\sigma(A - U) \prec_w \sigma(A - V) \prec_w \sigma(A + U). \tag{10.23}$$

Proof. Notice that

$$\sigma(A - U) = \sigma(UP - U) = \sigma(P - I).$$

Invoking Theorem 10.27 reveals the desired majorizations. ∎
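
Corollary 10.6 says the unitary polar factor is a nearest unitary matrix in the sense of weak majorization of singular values. The sketch below, assuming NumPy, obtains the polar factor from an SVD and compares it with a random unitary matrix produced by a QR factorization; these construction choices and the variable names are ours.

```python
import numpy as np

def weakly_majorized(x, y, tol=1e-9):
    """Return True if x ≺_w y."""
    xs = np.sort(np.asarray(x, dtype=float))[::-1]
    ys = np.sort(np.asarray(y, dtype=float))[::-1]
    return bool(np.all(np.cumsum(xs) <= np.cumsum(ys) + tol))

def sv(X):
    return np.linalg.svd(X, compute_uv=False)

rng = np.random.default_rng(6)
n = 5
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

# Polar decomposition A = U P from the SVD A = W diag(s) Vh.
W, s, Vh = np.linalg.svd(A)
U = W @ Vh                              # unitary factor
P = Vh.conj().T @ np.diag(s) @ Vh       # positive semidefinite factor

# A random unitary matrix V from a QR factorization.
V, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))

polar_ok = bool(np.allclose(U @ P, A))
lower_ok = weakly_majorized(sv(A - U), sv(A - V))   # σ(A − U) ≺_w σ(A − V)
upper_ok = weakly_majorized(sv(A - V), sv(A + U))   # σ(A − V) ≺_w σ(A + U)
```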


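We end this section by presenting another result of Fan.

Theorem 10.28 (Fan) Let $A$ be an $n \times n$ matrix with eigenvalues
$\lambda_1(A), \dots, \lambda_n(A)$. Let $\operatorname{Re} \lambda(A) = (\operatorname{Re} \lambda_1(A), \dots, \operatorname{Re} \lambda_n(A))$. Then

$$\operatorname{Re} \lambda(A) \prec \lambda(H(A)),$$

where $H(A) = (A + A^*)/2$ is the Hermitian part of the matrix $A$.

Proof. By the Schur triangularization theorem, we can write
$A = V^*TV$, where $V$ is unitary and $T$ is upper-triangular. Moreover, we
may assume that $\operatorname{Re} \lambda_1(A) \ge \cdots \ge \operatorname{Re} \lambda_n(A)$. Note that

$$\sum_{i=1}^n \lambda_i(H(A)) = \operatorname{tr} H(A) = \operatorname{Re} \operatorname{tr} A = \operatorname{Re} \sum_{i=1}^n \lambda_i(A) = \sum_{i=1}^n \operatorname{Re} \lambda_i(A).$$

Now for partial sums, by the min-max representation, we have

$$\begin{aligned}
\sum_{i=1}^k \lambda_i(H(A)) &= \max_{UU^* = I_k} \operatorname{tr} U (H(A)) U^* \\
&= \max_{UU^* = I_k} \operatorname{tr} U V^* (H(T)) V U^* \\
&= \max_{WW^* = I_k} \operatorname{tr} W (H(T)) W^*.
\end{aligned}$$

Take $W = (I_k, 0)$, $k \times n$. Then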


k k a k 
Yo acH(ay) > do (MATA) _ yo Reai(A). 
i=1 i=1 i=1 
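
Fan's majorization $\operatorname{Re} \lambda(A) \prec \lambda(H(A))$ can be observed on a random complex matrix. A minimal sketch, assuming NumPy; the helper `majorized` is an illustrative name.

```python
import numpy as np

def majorized(x, y, tol=1e-9):
    """Return True if x ≺ y: bounded partial sums and equal total sums."""
    xs = np.sort(np.asarray(x, dtype=float))[::-1]
    ys = np.sort(np.asarray(y, dtype=float))[::-1]
    partial = np.all(np.cumsum(xs) <= np.cumsum(ys) + tol)
    return bool(partial and abs(xs.sum() - ys.sum()) <= tol)

rng = np.random.default_rng(8)
A = rng.standard_normal((6, 6)) + 1j * rng.standard_normal((6, 6))

re_eig = np.linalg.eigvals(A).real                   # Re λ(A)
herm_eig = np.linalg.eigvalsh((A + A.conj().T) / 2)  # λ(H(A))

fan_ok = majorized(re_eig, herm_eig)
```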
Problems 


1. Let $z$ be a complex number. Show that $|1 - |z|| \le |1 - z| \le 1 + |z|$.
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2. Show that the following matrix inequalities do not hold in general 
|A—|2|B| <|A- 2B] < A+|z|B 
by taking z =i, A= eae and B = eal . 
3. Let $A \in M_n$ and $x$ be any $n$-row complex vector. Show that
$\lambda(A + x^*x) \prec (\lambda_1(A) + xx^*, \lambda_2(A), \dots, \lambda_n(A))$.
4. Show by example that neither of the following holds in general. 


oi(A +B) > oi(A)+on(B), i= 1,2,-..5n3 


iM 
oe 
= 
+ 
B 
IV 

iM = 


4) om ii 4); k =1,2,...,n, 


where A, B € M,,. However, the ad inequality is true if the plus 
sign “+” on the right-hand side is replace by the minus sign “—”. 


5. Let $A$ be an $n \times n$ complex matrix. Denote $S(A) = \frac{A - A^*}{2}$. Show
that for any $n \times n$ skew-Hermitian matrix $G$,

$$\sigma(A - S(A)) \prec_w \sigma(A - G).$$

6. Let $A \in M_n$ have a singular value decomposition $A = UDV$, where
$U$ and $V$ are unitary. Show that for any $n \times n$ unitary matrix $W$,

$$\sigma(A - UV) \prec_w \sigma(A - W) \prec_w \sigma(A + UV).$$

7. If $A = (a_{ij}) \in M_n$ is normal and $d(A) = (a_{11}, \dots, a_{nn})$, show that

$$\operatorname{Re} d(A) \prec \operatorname{Re} \lambda(A).$$

8. (Fan—Hoffman) Let A be a square complex matrix. Show that 
A*+A A*+A 
a( _ ) a( ~ )| 40 ota, 


9. Let $A \in M_n$. Denote the (necessarily real) eigenvalues of the Hermitian
matrix $H(A) = \frac{A + A^*}{2}$ by $h_1 \ge h_2 \ge \cdots \ge h_n$. Show that

$$h_i \le \sigma_i(A), \quad i = 1, 2, \dots, n.$$

However, if we rearrange the absolute values of these eigenvalues of
$H(A)$ in the decreasing order $|h_{i_1}| \ge |h_{i_2}| \ge \cdots \ge |h_{i_n}|$, show by
example that $|h_{i_t}| \le \sigma_t(A)$ does not hold in general.
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10. 


11. 


12. 


13. 


14. 


15. 


16. 
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Let A and B be n x n complex matrices. Show that 
o(A+ B) xy o(A @ B) Xw o(|A| +|B)). 


Let $A_1, \dots, A_m$ be $n$-square positive semidefinite matrices and let
$\lambda_1, \dots, \lambda_m$ be complex numbers. Show that

$$|\det(\lambda_1 A_1 + \cdots + \lambda_m A_m)| \le \det(|\lambda_1| A_1 + \cdots + |\lambda_m| A_m).$$

Let $A$ and $B$ be $n \times n$ positive semidefinite matrices. Show that
$|\lambda(A - B)| \prec_{w\log} \lambda(A + B)$.

Let A and B be n x n positive semidefinite matrices. Show that 

a,(A—B) <o,(AGB) 


$= 415 25.4024% 


Let $A$ and $B$ be $n \times n$ positive semidefinite matrices. Show that
$(\lambda(A + B), 0) \prec (\lambda(A), \lambda(B))$.


Let A and B be n x n positive definite matrices. Show that 


1 
Ai(A) + An-i+1(B) 


1 


<tr((A+B)") <)) apap 


i=1 


and 
I (A;(A) + Ana (B)) <det(A+B)< I (Ai(A) + ri(B)). 


Let A, B be nx n strict contractions, i.e., 71(A) < 


(TaAtA) (f= Ate 
= ( (I—BtA)-! (I= B*B)-! 


Show that 
(1,1,...,1) < d(H) ~ (A((I— A*A)~'), A((I — B*B)~")) ~ (A). 


d 


Show also that $H \ge \begin{pmatrix} I & I \\ I & I \end{pmatrix}$ to conclude that $H$ has at least $n$ eigenvalues
greater than or equal to 2. [Hint: Expand $(I - XY)^{-1}$.]

10.6 Majorization for Matrix Product


We have shown in the previous section (see (10.16)) that for complex
matrices $A$ and $B$ of the same size

$$\sigma(A + B) \prec_w \sigma(A) + \sigma(B).$$

The analogue of this for product is: if $A$ and $B$ are $n \times n$ matrices,

$$\sigma(AB) \prec_w \sigma(A) \circ \sigma(B). \tag{10.24}$$

In particular, for $n \times n$ positive semidefinite matrices $A$ and $B$,

$$\lambda(AB) \prec_w \lambda(A) \circ \lambda(B).$$

(10.24) was proved in Section 10.4 (see Corollary 10.3). In this
section, we first present a different classic proof of it, study the
majorization inequalities of the matrix product $AB$, then move on to
show the majorization inequalities concerning the power of product
(i.e., $(AB)^m$) and the product of the powers (i.e., $A^m B^m$).


Theorem 10.29 (Lidskii) Let $A$ and $B$ be $n \times n$ positive semidefinite
matrices and let $1 \le i_1 < \cdots < i_k \le n$. Then

$$\prod_{t=1}^k \lambda_{i_t}(AB) \le \prod_{t=1}^k \lambda_{i_t}(A)\, \lambda_t(B). \tag{10.25}$$

Equality holds when $k = n$.

Proof. Note that $\lambda_i(AB) \le \lambda_i(A)\lambda_1(B)$ for any positive semidefinite
matrices $A$ and $B$ and any index $i$ (Theorem 8.12). Apply this to the
compound matrix $(AB)^{(k)}$. Because $\prod_{t=1}^k \lambda_{i_t}(AB)$ is an eigenvalue
of $(AB)^{(k)}$ indexed by $\alpha = (i_1, \dots, i_k)$, we have

$$\begin{aligned}
\prod_{t=1}^k \lambda_{i_t}(AB) = \lambda_{\alpha}\big((AB)^{(k)}\big) &= \lambda_{\alpha}\big(A^{(k)} B^{(k)}\big) \\
&\le \lambda_{\alpha}\big(A^{(k)}\big)\, \lambda_1\big(B^{(k)}\big) = \prod_{t=1}^k \lambda_{i_t}(A)\, \lambda_t(B).
\end{aligned}$$

Equality holds when $k = n$ as $\det(AB) = \det A \det B$. ∎

Since $\sigma_j^2(X) = \lambda_j(X^*X)$ for any matrix $X$ and index $j$, we have

Corollary 10.7 Let $A, B \in M_n$ and let $1 \le i_1 < \cdots < i_k \le n$. Then

$$\prod_{t=1}^k \sigma_{i_t}(AB) \le \prod_{t=1}^k \sigma_{i_t}(A)\, \sigma_t(B). \tag{10.26}$$

Equality holds when $k = n$.

Taking $i_t = t$ in the corollary reveals the log-majorization

$$\sigma(AB) \prec_{\log} \sigma(A) \circ \sigma(B). \tag{10.27}$$

In particular, if $A$ and $B$ are positive semidefinite matrices, then

$$\lambda(AB) \prec_{\log} \lambda(A) \circ \lambda(B).$$

By Theorem 10.15, (10.27) implies the weak majorization (10.24).
The next theorem gives lower bounds for $\sigma(AB)$ and $\lambda(AB)$.


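Theorem 10.30 Let $A$ and $B$ be $n \times n$ complex matrices. Then

$$\sigma(A) \circ \sigma^{\uparrow}(B) \prec_{\log} \sigma(AB); \qquad \sigma(A) \circ \sigma^{\uparrow}(B) \prec_w \sigma(AB).$$

In particular, if $A$ and $B$ are positive semidefinite, then

$$\lambda(A) \circ \lambda^{\uparrow}(B) \prec_{\log} \lambda(AB); \qquad \lambda(A) \circ \lambda^{\uparrow}(B) \prec_w \lambda(AB).$$

Proof. It is sufficient to show the first log-majorization. We may assume
that $A$ and $B$ are nonsingular by continuity. Note that (10.26)
also holds when $AB$ on the left-hand side is replaced by $BA$. Taking
logarithms of both sides of (10.26) with $BA$ for $AB$, we get

$$\max_{1 \le i_1 < \cdots < i_k \le n} \sum_{t=1}^k \big(\ln \sigma_{i_t}(BA) - \ln \sigma_{i_t}(A)\big) \le \sum_{t=1}^k \ln \sigma_t(B).$$

It follows that $\ln \sigma(BA) - \ln \sigma(A) \prec \ln \sigma(B)$. With $A$ replaced by
$B^{-1}$ and then $B$ by $AB$, we obtain $\ln \sigma(A) - \ln \sigma(B^{-1}) \prec \ln \sigma(AB)$,
i.e., $\ln(\sigma(A) \circ \sigma^{\uparrow}(B)) \prec \ln \sigma(AB)$, or $\sigma(A) \circ \sigma^{\uparrow}(B) \prec_{\log} \sigma(AB)$. ∎

In summary, we have for $n \times n$ complex matrices $A$ and $B$

$$\sigma(A) \circ \sigma^{\uparrow}(B) \prec_{\log} \sigma(AB) \prec_{\log} \sigma(A) \circ \sigma(B); \tag{10.28}$$

$$\sigma(A) \circ \sigma^{\uparrow}(B) \prec_w \sigma(AB) \prec_w \sigma(A) \circ \sigma(B).$$

In particular, if $A$ and $B$ are positive semidefinite, then

$$\lambda(A) \circ \lambda^{\uparrow}(B) \prec_{\log} \lambda(AB) \prec_{\log} \lambda(A) \circ \lambda(B); \tag{10.29}$$

$$\lambda(A) \circ \lambda^{\uparrow}(B) \prec_w \lambda(AB) \prec_w \lambda(A) \circ \lambda(B).$$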
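
The two-sided bounds in (10.28) can be tested by comparing cumulative products (for the log-majorizations) and cumulative sums (for the weak majorizations). This is a sketch, assuming NumPy; the helper names and the random test matrices are ours.

```python
import numpy as np

def weak_log_majorized(x, y, rtol=1e-9):
    """Partial products of sorted nonnegative x bounded by those of sorted y."""
    xs = np.sort(np.asarray(x, dtype=float))[::-1]
    ys = np.sort(np.asarray(y, dtype=float))[::-1]
    return bool(np.all(np.cumprod(xs) <= np.cumprod(ys) * (1 + rtol)))

def weakly_majorized(x, y, tol=1e-9):
    xs = np.sort(np.asarray(x, dtype=float))[::-1]
    ys = np.sort(np.asarray(y, dtype=float))[::-1]
    return bool(np.all(np.cumsum(xs) <= np.cumsum(ys) + tol))

rng = np.random.default_rng(9)
n = 5
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

sA = np.linalg.svd(A, compute_uv=False)
sB = np.linalg.svd(B, compute_uv=False)
sAB = np.linalg.svd(A @ B, compute_uv=False)

lower = sA * sB[::-1]     # σ(A) ∘ σ↑(B)
upper = sA * sB           # σ(A) ∘ σ(B)

log_chain_ok = weak_log_majorized(lower, sAB) and weak_log_majorized(sAB, upper)
full_products_equal = bool(np.isclose(np.prod(lower), np.prod(sAB)))
weak_chain_ok = weakly_majorized(lower, sAB) and weakly_majorized(sAB, upper)
```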


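We now study the majorization inequalities concerning the power
of product, $(AB)^m$, and the product of the powers, $A^m B^m$.

By (10.27), it is immediate that for any positive integer $m$ and
$n \times n$ positive semidefinite matrices $A$ and $B$,

$$\lambda(A^m B^m) \prec_{\log} \lambda^m(A) \circ \lambda^m(B)$$

(here $\lambda^{\alpha}(X) = (\lambda_1^{\alpha}(X), \dots, \lambda_n^{\alpha}(X)) = ((\lambda_1(X))^{\alpha}, \dots, (\lambda_n(X))^{\alpha})$). Thus

$$\lambda^{1/m}(A^m B^m) \prec_{\log} \lambda(A) \circ \lambda(B).$$

This says that $\lambda^{1/m}(A^m B^m)$ is bounded above in majorization by
$\lambda(A) \circ \lambda(B)$. In what follows we show it is bounded below by $\lambda(AB)$,
or equivalently, $\lambda((AB)^m) \prec_{\log} \lambda(A^m B^m)$. Putting these together,

$$\lambda(AB) \prec_{\log} \lambda^{1/m}(A^m B^m) \prec_{\log} \lambda(A) \circ \lambda(B).$$

In fact, more can be said about $\lambda^{1/m}(A^m B^m)$, as we show.
We have seen that for $n \times n$ positive semidefinite matrices $A$, $B$,

$$\lambda_1(AB) \le \lambda_1(A)\lambda_1(B), \qquad \lambda_n(AB) \ge \lambda_n(A)\lambda_n(B),$$

and that for arbitrary $n \times n$ matrices $X$, $Y$,

$$|\lambda_1(XY)| \le \sigma_1(XY) \le \sigma_1(X)\sigma_1(Y)$$

and

$$|\lambda_n(XY)| \ge \sigma_n(XY) \ge \sigma_n(X)\sigma_n(Y).$$

Theorem 10.31 Let $A$ and $B$ be $n \times n$ positive semidefinite matrices.
Then for any positive integers $p$ and $q$, where $p \le q$,

$$\lambda_1(AB) \le \lambda_1^{1/p}(A^p B^p) \le \lambda_1^{1/q}(A^q B^q) \le \lambda_1(A)\lambda_1(B)$$

and

$$\lambda_n(AB) \ge \lambda_n^{1/p}(A^p B^p) \ge \lambda_n^{1/q}(A^q B^q) \ge \lambda_n(A)\lambda_n(B).$$

Proof. We use induction on positive integer $m$ to show that

$$\lambda_1^{1/m}(A^m B^m) \le \lambda_1^{1/(m+1)}(A^{m+1} B^{m+1}). \tag{10.30}$$

If $m = 1$, then

$$\lambda_1(AB) \le \sigma_1(AB) = \lambda_1^{1/2}(ABBA) = \lambda_1^{1/2}(A^2 B^2).$$

Suppose that the inequalities hold for integers no more than $m$. Put

$$X = A^{(m+1)/2} B^{(m+1)/2}, \qquad Y = B^{(m-1)/2} A^{(m-1)/2}.$$

Then $\lambda_1(XY) = \lambda_1(A^m B^m)$. However,

$$\lambda_1(XY) \le \sigma_1(X)\sigma_1(Y) = \lambda_1^{1/2}(A^{m+1} B^{m+1})\, \lambda_1^{1/2}(A^{m-1} B^{m-1}).$$

By induction hypothesis on $m - 1$, we have

$$\begin{aligned}
\lambda_1(A^m B^m) &\le \lambda_1^{1/2}(A^{m+1} B^{m+1})\, \lambda_1^{1/2}(A^{m-1} B^{m-1}) \\
&\le \lambda_1^{1/2}(A^{m+1} B^{m+1})\, \lambda_1^{(m-1)/2m}(A^m B^m).
\end{aligned}$$

Thus

$$\lambda_1^{(m+1)/2m}(A^m B^m) \le \lambda_1^{1/2}(A^{m+1} B^{m+1});$$

that is,

$$\lambda_1^{1/m}(A^m B^m) \le \lambda_1^{1/(m+1)}(A^{m+1} B^{m+1}).$$

This proves (10.30). On the other hand, for any positive integer $m$,

$$\lambda_1^{1/m}(A^m B^m) \le \lambda_1^{1/m}(A^m)\, \lambda_1^{1/m}(B^m) = \lambda_1(A)\lambda_1(B).$$

The second part is similarly proven. ∎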
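
The monotonicity in Theorem 10.31 can be illustrated by computing $\lambda_1^{1/m}(A^m B^m)$ for $m = 1, \dots, 6$. The sketch below assumes NumPy; the matrices are random positive definite matrices scaled to have largest eigenvalue 1 (a normalization we add purely to keep the powers well scaled), and all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(7)

def random_pd(n):
    """Random positive definite matrix scaled so its largest eigenvalue is 1."""
    X = rng.standard_normal((n, n))
    P = X @ X.T + np.eye(n)
    return P / np.linalg.norm(P, 2)

A, B = random_pd(4), random_pd(4)

def lam1_root(m):
    """Compute the m-th root of the largest eigenvalue of A^m B^m."""
    Am = np.linalg.matrix_power(A, m)
    Bm = np.linalg.matrix_power(B, m)
    lam1 = np.max(np.linalg.eigvals(Am @ Bm).real)  # eigenvalues are real, nonnegative
    return lam1 ** (1.0 / m)

seq = np.array([lam1_root(m) for m in range(1, 7)])

nondecreasing = bool(np.all(np.diff(seq) >= -1e-9))
upper = np.linalg.eigvalsh(A)[-1] * np.linalg.eigvalsh(B)[-1]   # λ_1(A) λ_1(B)
bounded_above = bool(seq[-1] <= upper + 1e-9)
```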
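Theorem 10.32 (Wang–Gong) Let $A$ and $B$ be $n \times n$ positive
semidefinite matrices. Then for any positive integers $p$ and $q$, $p \le q$,

$$\lambda(AB) \prec_{\log} \lambda^{1/p}(A^p B^p) \prec_{\log} \lambda^{1/q}(A^q B^q) \prec_{\log} \lambda(A) \circ \lambda(B).$$

Proof. We only show that for any positive integer $m$,

$$\lambda^{1/m}(A^m B^m) \prec_{\log} \lambda^{1/(m+1)}(A^{m+1} B^{m+1}).$$

Considering the compound matrix $(A^m B^m)^{(k)}$, where $k \le n$, we have

$$(A^m B^m)^{(k)} = (A^m)^{(k)} (B^m)^{(k)} = (A^{(k)})^m (B^{(k)})^m.$$

Moreover,

$$\lambda_1\big((A^{(k)})^m (B^{(k)})^m\big) = \lambda_1\big((A^m B^m)^{(k)}\big) = \prod_{i=1}^k \lambda_i(A^m B^m).$$

Note that $A^{(k)}$ and $B^{(k)}$ are positive semidefinite. By (10.30),

$$\begin{aligned}
\prod_{i=1}^k \lambda_i^{1/m}(A^m B^m) &= \lambda_1^{1/m}\big((A^{(k)})^m (B^{(k)})^m\big) \\
&\le \lambda_1^{1/(m+1)}\big((A^{(k)})^{m+1} (B^{(k)})^{m+1}\big) \\
&= \prod_{i=1}^k \lambda_i^{1/(m+1)}(A^{m+1} B^{m+1}).
\end{aligned}$$

When $k = n$, equality holds as for any positive integer $m$,

$$\prod_{i=1}^n \lambda_i^{1/m}(A^m B^m) = \det A \det B. \qquad ∎$$

The previous theorem refines $\lambda(AB) \prec_{\log} \lambda(A) \circ \lambda(B)$. Our next
theorem shows an analogous result for $\lambda(A) \circ \lambda^{\uparrow}(B) \prec_{\log} \lambda(AB)$.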


Theorem 10.33 (Wang–Gong) Let $A$ and $B$ be $n \times n$ positive
semidefinite matrices. Then for any positive integers $p$ and $q$, $p \le q$,

$$\lambda(A) \circ \lambda^{\uparrow}(B) \prec_{\log} \lambda^q(A^{1/q} B^{1/q}) \prec_{\log} \lambda^p(A^{1/p} B^{1/p}) \prec_{\log} \lambda(AB).$$

Proof. By the first log-majorization of (10.29), we have

$$\lambda(A^{1/m}) \circ \lambda^{\uparrow}(B^{1/m}) \prec_{\log} \lambda(A^{1/m} B^{1/m}).$$

Raising both sides to the $m$th power, we arrive at

$$\lambda(A) \circ \lambda^{\uparrow}(B) \prec_{\log} \lambda^m(A^{1/m} B^{1/m}).$$

We now show that

$$\lambda^{m+1}(A^{1/(m+1)} B^{1/(m+1)}) \prec_{\log} \lambda^m(A^{1/m} B^{1/m}).$$

By Theorem 10.31, we have (Problem 5)

$$\lambda_1^{m+1}(A^{1/(m+1)} B^{1/(m+1)}) \le \lambda_1^m(A^{1/m} B^{1/m}) \le \lambda_1(AB).$$

The desired log-majorizations follow by making use of compound
matrices as we did in the proof of Theorem 10.32. ∎

The following corollaries are immediate from the theorems. For
instance, with $A$ replaced by $A^m$ and $B$ by $B^m$ in the log-majorization
$\lambda^m(A^{1/m} B^{1/m}) \prec_{\log} \lambda(AB)$, we obtain $\lambda^m(AB) \prec_{\log} \lambda(A^m B^m)$, which
results in the weak majorization $\lambda^m(AB) \prec_w \lambda(A^m B^m)$.


Corollary 10.8 Let $A$ and $B$ be $n \times n$ positive semidefinite matrices.
Then for any positive integer $m$,

$$\lambda^m(A) \circ (\lambda^m(B))^{\uparrow} \prec_{\log} \lambda^m(AB) \prec_{\log} \lambda(A^m B^m) \prec_{\log} \lambda^m(A) \circ \lambda^m(B)$$

and

$$\lambda^m(A) \circ (\lambda^m(B))^{\uparrow} \prec_w \lambda^m(AB) \prec_w \lambda(A^m B^m) \prec_w \lambda^m(A) \circ \lambda^m(B).$$

Corollary 10.9 (Lieb–Thirring) Let $A$ and $B$ be $n \times n$ positive
semidefinite matrices. Then for any positive integer $m$,

$$\operatorname{tr}(AB)^m \le \operatorname{tr}(A^m B^m). \tag{10.31}$$

Equality holds if and only if $m = 1$ or $AB = BA$.

Proof. The inequality is readily seen from the previous corollary.
We show that equality occurs if and only if $m = 1$ or $AB = BA$.

Sufficiency is obvious. For necessity, assume that equality holds
for some $m \ge 2$. We first consider the case $m = 2$; that is, $\operatorname{tr}(AB)^2 = \operatorname{tr}(A^2 B^2)$.
Without loss of generality, we may assume that $A$ is a
diagonal matrix with diagonal entries $a_1, \dots, a_n$. Then

$$\begin{aligned}
\operatorname{tr}(A^2 B^2) - \operatorname{tr}(AB)^2 &= \sum_{i,j} a_i^2 |b_{ij}|^2 - \sum_{i,j} a_i a_j |b_{ij}|^2 \\
&= \sum_{i<j} (a_i - a_j)^2 |b_{ij}|^2 = 0.
\end{aligned}$$

Thus $a_i b_{ij} = a_j b_{ij}$ for every pair of $i$ and $j$; that is, $AB = BA$.

Let $m > 2$. We show that $\operatorname{tr}(AB)^m = \operatorname{tr}(A^m B^m)$ implies $\operatorname{tr}(AB)^2 = \operatorname{tr}(A^2 B^2)$,
which leads to $AB = BA$, as we have just proven.

If $\operatorname{tr}(AB)^2 \ne \operatorname{tr}(A^2 B^2)$, we apply the strictly increasing and
strictly convex function $f(t) = t^{m/2}$, $t \ge 0$, to the weak majorization
$\lambda^2(AB) \prec_w \lambda(A^2 B^2)$. By Theorem 10.14, we arrive at

$$\operatorname{tr}(AB)^m = \sum_{i=1}^n \lambda_i^{m/2}\big((AB)^2\big) < \sum_{i=1}^n \lambda_i^{m/2}(A^2 B^2).$$

On the other hand, since $x \prec_w y \Rightarrow x^m \prec_w y^m$ for $x, y \in \mathbb{R}^n_+$
(Problem 22, Section 10.3) and $\lambda^{1/2}(A^2 B^2) \prec_w \lambda^{1/m}(A^m B^m)$ (Theorem 10.32),
which results in $\lambda^{m/2}(A^2 B^2) \prec_w \lambda(A^m B^m)$, we have

$$\sum_{i=1}^n \lambda_i^{m/2}(A^2 B^2) \le \sum_{i=1}^n \lambda_i(A^m B^m) = \operatorname{tr}(A^m B^m).$$

This contradicts the assumption that $\operatorname{tr}(AB)^m = \operatorname{tr}(A^m B^m)$. ∎
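
The Lieb–Thirring inequality (10.31) and its equality case for commuting matrices can be checked as follows. This is a sketch, assuming NumPy; the random positive semidefinite matrices, the diagonal pair used for the commuting case, and the variable names are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(10)

def random_psd(n):
    X = rng.standard_normal((n, n))
    return X @ X.T          # real positive semidefinite

A, B = random_psd(4), random_psd(4)
mp = np.linalg.matrix_power

# Compare tr (AB)^m with tr (A^m B^m) for m = 1, ..., 5.
pairs = [(np.trace(mp(A @ B, m)), np.trace(mp(A, m) @ mp(B, m))) for m in range(1, 6)]
inequality_ok = all(lhs <= rhs * (1 + 1e-9) for lhs, rhs in pairs)

# Commuting (diagonal) matrices give equality for every m.
D1 = np.diag([1.0, 2.0, 3.0])
D2 = np.diag([0.5, 4.0, 1.0])
equality_when_commuting = bool(np.isclose(np.trace(mp(D1 @ D2, 3)),
                                          np.trace(mp(D1, 3) @ mp(D2, 3))))
```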


Problems 


1. Show that the inequalities in Theorem 10.31 hold for singular values 
01 (resp.,@p) in place of »1 (resp.,An) when A and B are normal 
matrices. If A or B is not normal, then they are not true in general: 
Take A = | a) and B = I, as an example. 
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. Let A and B be positive semidefinite matrices of the same size, and 


p and q be positive numbers with 1 < p < ov, 7 + Fi = 1. Show that 
tr(Al/? BY) < (tr.A)!/? (tr B)1/4 


and 
(tr(A + B))'/? < (tr A?)!/? + (tr BY)1/4, 


. Let A be an nxn positive semidefinite matrix, and p and q be positive 


numbers such that 1 < p < 00, : +4=1. Show that for all n x n 
positive semidefinite matrices B with tr BY = 1, the inequality 


tr(AB) < (tr A?)1/? 


holds. Show that equality occurs if and only if BY = AP : 


4. Let $A$ and $B$ be positive semidefinite matrices of the same size. Show
that $\lambda_1^m(AB) \le \lambda_1(A^m)\lambda_1(B^m)$ for any positive integer $m$.

5. Let $A$ and $B$ be positive semidefinite matrices of the same size. Show
that $\lambda_1^m(A^{1/m} B^{1/m})$ decreases as $m$ increases; that is,

$$\lambda_1^{m+1}(A^{1/(m+1)} B^{1/(m+1)}) \le \lambda_1^m(A^{1/m} B^{1/m}).$$

6. Show Theorem 10.32 for singular values with $A$ and $B$ both normal.
. Let Ao B = (aj;b;;) be the Hadamard product of A = (a;;) and 


B = (bij). Show, with A= (5) and B= (;}), that the inequality 


>; Aji" (A)An—i+1(B) < tr ((A 0 B)™) 


does not hold in general for positive semidefinite matrices A and B. 


. Let A be a square matrix. Show that 


o°(A) Xo(A?) © Ais normal. 


9. Referring to (10.28), show that for $n \times n$ complex matrices $A$ and $B$,
$\sigma(A) \circ \sigma^{\uparrow}(B) \prec_{\log} \sigma(BA)$. Note that $\sigma(AB) \ne \sigma(BA)$ in general.

10. Let $A$ and $B$ be positive semidefinite and $0 < r < 1$. Show that

$$\lambda_1(A^r B^r) \le \lambda_1^r(AB).$$

The inequality is reversed for $r > 1$.
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11. 


12. 


13. 


14. 


15. 


16. 


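11. Let $A$ and $B$ be $n \times n$ positive semidefinite matrices and $k$ be a
positive integer, $k \le n$. If $1 \le i_1 < i_2 < \cdots < i_k \le n$, show that

(a) $\prod_{t=1}^k \lambda_t(AB) \ge \prod_{t=1}^k \lambda_{i_t}(A)\lambda_{n-i_t+1}(B)$.
(b) $\sum_{t=1}^k \lambda_t(AB) \ge \sum_{t=1}^k \lambda_{i_t}(A)\lambda_{n-i_t+1}(B)$.
(c) $\operatorname{tr}(AB) \ge \sum_{t=1}^n \lambda_t(A)\lambda_{n-t+1}(B)$.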


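12. Let $A$ and $B$ be $n \times n$ complex matrices and $k$ be a positive integer,
$k \le n$. If $1 \le i_1 < i_2 < \cdots < i_k \le n$, show that

$$\prod_{t=1}^k \sigma_t(AB) \ge \prod_{t=1}^k \sigma_{i_t}(A)\sigma_{n-i_t+1}(B),$$

but it is not true in general that

$$\sum_{t=1}^k \sigma_t(AB) \ge \sum_{t=1}^k \sigma_{i_t}(A)\sigma_{n-i_t+1}(B).$$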


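13. Let $A$, $B$, and $C$ be $n \times n$ positive definite matrices. Show that

$$\ln\lambda(A^{-1}C) \prec \ln\lambda(A^{-1}B) + \ln\lambda(B^{-1}C).$$

14. Let $A$ be an $n \times n$ matrix. Show that for any positive integer $k$,

$$\sigma(A^k) \prec_{\log} \sigma^k(A).$$

Deduce that

$$\operatorname{tr}\big((A^k)^*A^k\big) \le \operatorname{tr}(A^*A)^k.$$

Show that equality holds if and only if $A$ is a normal matrix.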


15. Let $A$ and $B$ be $n \times n$ positive semidefinite matrices and $k$ and $m$ be
positive integers, $k \le m$. Show that

$$\operatorname{tr}(A^kB^k)^m \le \operatorname{tr}(A^mB^m)^k.$$


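16. Let $A$ and $B$ be $n \times n$ Hermitian matrices and $m$ be a positive integer.
Show that

$$|\operatorname{tr}(AB)^{2m}| \le \operatorname{tr}(A^2B^2)^m \le \operatorname{tr}(A^{2m}B^{2m}).$$

Show by the example A = ¢ . _ and B= CG. 0) that in general

$$|\operatorname{tr}(AB)^3| \not\le \operatorname{tr}(A^3B^3).$$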


10.7 Majorization and Unitarily Invariant Norms


This section shows a close relation between weak majorization
and unitarily invariant matrix (vector) norms. To be precise, for
$A, B \in M_{m\times n}$, we show that $\sigma(A) \prec_w \sigma(B)$ if and only if $\|A\| \le \|B\|$
for all unitarily invariant matrix norms $\|\cdot\|$ on $M_{m\times n}$. Note that
$\sigma(A) = \lambda^{1/2}(A^*A)$ is an $n$-vector for an $m \times n$ matrix $A$. The symmetric
gauge functions that we introduce below serve as a "bridge" between
majorization and the matrix norm.

A symmetric gauge function is a real-valued function defined on
$\mathbb{R}^n$ that is invariant under any permutation of the components of the
vectors and any change of signs of the components. To be exact, $\phi$:
$\mathbb{R}^n \to \mathbb{R}$ is a symmetric gauge function if the following are satisfied.


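a. $\phi(x) \ge 0$, and $\phi(x) = 0$ if and only if $x = 0$.
b. $\phi(cx) = |c|\,\phi(x)$, $c \in \mathbb{R}$.
c. $\phi(x + y) \le \phi(x) + \phi(y)$.
d. $\phi(x_1, x_2, \ldots, x_n) = \phi(x_{p(1)}, x_{p(2)}, \ldots, x_{p(n)})$, where $p \in S_n$.
e. $\phi(x_1, x_2, \ldots, x_n) = \phi(\epsilon_1x_1, \epsilon_2x_2, \ldots, \epsilon_nx_n)$, where all $\epsilon_i = \pm 1$.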


Apparently, a symmetric gauge function is a norm on $\mathbb{R}^n$. Recall
that a vector norm is a function satisfying (a), (b), and (c).
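
Conditions (a) through (e) can be spot-checked numerically. The following is a minimal Python sketch and not part of the original text: the helper name `looks_like_symmetric_gauge`, the sample size, and the tolerance are illustrative assumptions, and passing a random test is only evidence, not a proof. It tries the maximum of the absolute values, the sum of the absolute values, and a weighted sum that is a norm but violates condition (d).

```python
import itertools
import random

def max_norm(x):
    """max_i |x_i|"""
    return max(abs(t) for t in x)

def sum_norm(x):
    """sum_i |x_i|"""
    return sum(abs(t) for t in x)

def weighted_sum(x):
    """|x_1| + 2|x_2| + ... + n|x_n|: a norm, but not permutation invariant."""
    return sum((i + 1) * abs(t) for i, t in enumerate(x))

def looks_like_symmetric_gauge(phi, n=3, trials=200, tol=1e-9, seed=0):
    """Spot-check conditions (a)-(e) on random vectors of R^n (hypothetical helper)."""
    rng = random.Random(seed)
    if phi([0.0] * n) != 0:                      # (a): phi(0) = 0
        return False
    for _ in range(trials):
        x = [rng.uniform(-5, 5) for _ in range(n)]
        y = [rng.uniform(-5, 5) for _ in range(n)]
        c = rng.uniform(-3, 3)
        if phi(x) <= 0:                          # (a): positivity for nonzero x
            return False
        if abs(phi([c * t for t in x]) - abs(c) * phi(x)) > tol:        # (b)
            return False
        if phi([s + t for s, t in zip(x, y)]) > phi(x) + phi(y) + tol:  # (c)
            return False
        for p in itertools.permutations(range(n)):                      # (d)
            if abs(phi([x[i] for i in p]) - phi(x)) > tol:
                return False
        for signs in itertools.product((1, -1), repeat=n):              # (e)
            if abs(phi([e * t for e, t in zip(signs, x)]) - phi(x)) > tol:
                return False
    return True

for phi in (max_norm, sum_norm, weighted_sum):
    print(phi.__name__, looks_like_symmetric_gauge(phi))
```

The first two functions pass every check, while the weighted sum is rejected at condition (d), since permuting the components changes its value.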

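One may check that $v(x) = \max_i |x_i|$ and $w(x) = \sum_i |x_i|$ are
symmetric gauge functions. Moreover, if $\phi$ is a symmetric gauge
function, then (a) $\phi(x) = \phi(|x|)$ and (b) $0 \le x \le y \Rightarrow \phi(x) \le \phi(y)$.
We demonstrate the proof of (b) for the case $n = 2$. Let $x = (x_1, x_2)$
and $y = (y_1, y_2)$. Let $t_1$ be such that $t_1y_1 = x_1$. Then $0 \le t_1 \le 1$ and

$$\begin{aligned}
\phi(x) = \phi(x_1, x_2) &= \phi(t_1y_1, x_2) \\
&= \phi\Big(\frac{1+t_1}{2}(y_1, x_2) + \frac{1-t_1}{2}(-y_1, x_2)\Big) \\
&\le \frac{1+t_1}{2}\,\phi(y_1, x_2) + \frac{1-t_1}{2}\,\phi(-y_1, x_2) \\
&= \phi(y_1, x_2).
\end{aligned}$$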


Likewise, $\phi(y_1, x_2) \le \phi(y_1, y_2) = \phi(y)$. Thus $\phi(x) \le \phi(y)$.

One can prove (Problem 5) that the $l_p$-norm on $\mathbb{R}^n$

$$\|x\|_p = \Big(\sum_{i=1}^n |x_i|^p\Big)^{1/p}, \quad p \ge 1,$$

is a symmetric gauge function, and so is the Ky Fan $k$-norm on $\mathbb{R}^n$

$$\|x\|_{(k)} = \max_{1 \le i_1 < \cdots < i_k \le n} \sum_{t=1}^k |x_{i_t}|.$$
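
As a small illustration of these two definitions, the Python sketch below (an addition, with function names chosen here for illustration) computes both norms. It uses the observation that the maximum over index sets in the Ky Fan $k$-norm is attained by the $k$ largest absolute values, and compares that shortcut with a literal maximum over all index sets.

```python
import itertools

def lp_norm(x, p):
    """(sum_i |x_i|^p)^(1/p) for p >= 1."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def ky_fan_norm(x, k):
    """Sum of the k largest absolute values of the components of x."""
    return sum(sorted((abs(t) for t in x), reverse=True)[:k])

def ky_fan_norm_bruteforce(x, k):
    """Same quantity, computed literally as a maximum over i_1 < ... < i_k."""
    return max(sum(abs(x[i]) for i in idx)
               for idx in itertools.combinations(range(len(x)), k))

x = [3.0, -4.0, 1.0]
print(lp_norm(x, 1), lp_norm(x, 2))
print([ky_fan_norm(x, k) for k in (1, 2, 3)])
print([ky_fan_norm_bruteforce(x, k) for k in (1, 2, 3)])
```

For $k = 1$ the Ky Fan norm is the maximum of the absolute values, and for $k = n$ it coincides with the $l_1$-norm.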


Theorem 10.34 Symmetric gauge functions are Schur-convex on $\mathbb{R}^n_+$.


Proof. Let $x, y \in \mathbb{R}^n_+$ and $x \prec y$. We need to show that $\phi(x) \le \phi(y)$
for any symmetric gauge function $\phi$ on $\mathbb{R}^n$. By Theorem 10.8, there
exists a doubly stochastic matrix $A$ such that $x = yA$. By Theorem 5.21,
every doubly stochastic matrix is a convex combination of
permutation matrices. Write $x = \sum_{i=1}^m t_i\,yP_i$, where $t_i$ are positive
numbers adding up to 1 and $P_i$ are permutation matrices. Then

$$\phi(x) = \phi\Big(\sum_{i=1}^m t_i\,yP_i\Big) \le \sum_{i=1}^m t_i\,\phi(yP_i) = \sum_{i=1}^m t_i\,\phi(y) = \phi(y). \qquad ∎$$
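
The averaging step in this proof can be traced on a concrete vector. The sketch below is an illustration added here, not the book's code; the vector `y`, the weights, and the three cyclic permutations are arbitrary choices. It forms $x = \sum_i t_i\,yP_i$ and compares a few symmetric gauge functions at $x$ and $y$.

```python
def convex_combination_of_permutations(y, weights, perms):
    """Return sum_i t_i * (y with components permuted by perms[i])."""
    assert abs(sum(weights) - 1.0) < 1e-12 and all(t >= 0 for t in weights)
    n = len(y)
    x = [0.0] * n
    for t, perm in zip(weights, perms):
        for j in range(n):
            x[j] += t * y[perm[j]]
    return x

def lp_norm(x, p):
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def ky_fan_norm(x, k):
    return sum(sorted((abs(t) for t in x), reverse=True)[:k])

y = [5.0, 2.0, 1.0]
weights = [0.5, 0.3, 0.2]                   # positive, adding up to 1
perms = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]   # three permutations of the indices
x = convex_combination_of_permutations(y, weights, perms)

print(x)
for p in (1, 2, 3):
    print("p =", p, lp_norm(x, p), "<=", lp_norm(y, p))
for k in (1, 2, 3):
    print("k =", k, ky_fan_norm(x, k), "<=", ky_fan_norm(y, k))
```

The components of `x` still add up to the same total as those of `y`, so the two sides agree (up to rounding) for $p = 1$ and $k = 3$, while the other norms strictly decrease.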


Theorem 10.35 Let $x, y \in \mathbb{R}^n$. Then $|x| \prec_w |y|$ if and only if
$\phi(x) \le \phi(y)$ for all symmetric gauge functions $\phi$ on $\mathbb{R}^n$.


Proof. For sufficiency, take $\phi$ to be the Ky Fan $k$-norm, $k =
1, 2, \ldots, n$. Then $\phi(x) \le \phi(y)$ yields $|x| \prec_w |y|$. For necessity, since
$|x| \prec_w |y|$, there exists a nonnegative vector $u$ such that $|x| \le u \prec |y|$.
The monotonicity property (b) and Theorem 10.34 imply that
$\phi(x) = \phi(|x|) \le \phi(u) \le \phi(|y|) = \phi(y)$. ∎
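
A short Python sketch may make the equivalence concrete. It is an illustration under stated assumptions: the helper `weakly_majorized` and the sample vectors are chosen here, and only finitely many gauge functions are tried, so the check is a sanity test rather than a proof.

```python
def weakly_majorized(x, y, tol=1e-12):
    """True if |x| is weakly majorized by |y| (x and y of equal length):
    each partial sum of the decreasingly sorted |x| is at most that of |y|."""
    a = sorted((abs(t) for t in x), reverse=True)
    b = sorted((abs(t) for t in y), reverse=True)
    sa = sb = 0.0
    for s, t in zip(a, b):
        sa += s
        sb += t
        if sa > sb + tol:
            return False
    return True

def lp_norm(x, p):
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def max_norm(x):
    return max(abs(t) for t in x)

# |x| is weakly majorized by |y|: partial sums 2, 4, 5 against 3, 5, 5.
x, y = [1.0, -2.0, 2.0], [3.0, 0.0, -2.0]
print(weakly_majorized(x, y))
print([lp_norm(x, p) <= lp_norm(y, p) + 1e-12 for p in (1, 1.5, 2, 3, 10)])

# Here weak majorization fails, and the l_1-norm detects it although the max norm does not.
x2, y2 = [2.0, 2.0], [3.0, 0.0]
print(weakly_majorized(x2, y2), max_norm(x2) <= max_norm(y2), lp_norm(x2, 1) <= lp_norm(y2, 1))
```

The second pair shows why the theorem quantifies over all symmetric gauge functions: a single norm inequality does not imply weak majorization.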


Recall that a matrix-vector norm (see Section 4.2, Chapter 4) on
the vector space $M_{m\times n}$ is a function $\|\cdot\|$ from $M_{m\times n}$ to $\mathbb{R}_+$ satisfying


i. $\|A\| \ge 0$, and $\|A\| = 0$ if and only if $A = 0$,
ii. $\|cA\| = |c|\,\|A\|$, $c \in \mathbb{C}$, and
iii. $\|A + B\| \le \|A\| + \|B\|$.

It is further said to be unitarily invariant if


v. $\|UAV\| = \|A\|$ for all $A \in M_{m\times n}$ and unitary $U$ and $V$.

Note that the multiplicative condition $\|AB\| \le \|A\|\,\|B\|$ is not
required for a matrix-vector norm.

If $\phi$ is a symmetric gauge function on $\mathbb{R}^n$ and $A$ is an $m \times n$
matrix, then $\sigma(A) \in \mathbb{R}^n$ and $\phi(\sigma(A))$ is well defined. Moreover,
$\phi(\sigma(UAV)) = \phi(\sigma(A))$ for all $m \times m$ unitary $U$ and $n \times n$ unitary $V$.
The following two theorems, due to von Neumann, best characterize
the relation between a unitarily invariant matrix-vector norm and
symmetric gauge functions through singular values.


Theorem 10.36 If $\phi: \mathbb{R}^n \to \mathbb{R}$ is a symmetric gauge function, then
$\|A\|_\phi = \phi(\sigma(A))$ is a unitarily invariant norm on $M_{m\times n}$.


Proof. Conditions (i) and (ii) are obviously satisfied. To show (iii),
let $A$ and $B$ be $m \times n$ matrices. Because $\sigma(A+B) \prec_w \sigma(A) + \sigma(B)$,

$$\begin{aligned}
\|A + B\|_\phi = \phi(\sigma(A+B)) &\le \phi(\sigma(A) + \sigma(B)) \\
&\le \phi(\sigma(A)) + \phi(\sigma(B)) = \|A\|_\phi + \|B\|_\phi.
\end{aligned}$$

Thus $\|\cdot\|_\phi$ is a matrix-vector norm. It is unitarily invariant because
$\sigma(UAV) = \sigma(A)$ for any $m \times m$ unitary $U$ and $n \times n$ unitary $V$. ∎
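
The construction $\|A\|_\phi = \phi(\sigma(A))$ can be tried out on real $2 \times 2$ matrices without any linear algebra library. The sketch below is an illustration under explicit assumptions: it is restricted to real $2 \times 2$ matrices, it obtains the singular values from $\sigma_1^2 + \sigma_2^2 = \sum_{i,j} a_{ij}^2$ and $\sigma_1\sigma_2 = |\det A|$, and it tests invariance only under rotations, which are special unitary matrices. All function names are chosen here.

```python
import math

def singular_values_2x2(a):
    """Singular values s1 >= s2 >= 0 of a real 2x2 matrix [[p, q], [r, s]]."""
    (p, q), (r, s) = a
    fro2 = p * p + q * q + r * r + s * s      # s1^2 + s2^2
    det = abs(p * s - q * r)                  # s1 * s2
    disc = math.sqrt(max(fro2 * fro2 - 4.0 * det * det, 0.0))
    return (math.sqrt((fro2 + disc) / 2.0),
            math.sqrt(max((fro2 - disc) / 2.0, 0.0)))

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def rotation(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def gauge_norm(phi, a):
    """The norm induced by the symmetric gauge function phi: phi(sigma(a))."""
    return phi(singular_values_2x2(a))

def spectral(v):
    return max(abs(t) for t in v)

def trace_norm(v):
    return sum(abs(t) for t in v)

def frobenius(v):
    return math.sqrt(sum(t * t for t in v))

A = [[3.0, 0.0], [4.0, 5.0]]
B = [[1.0, 2.0], [0.0, -1.0]]
UAV = matmul(matmul(rotation(0.7), A), rotation(-1.9))

print(singular_values_2x2(A))
for phi in (spectral, trace_norm, frobenius):
    print(phi.__name__, gauge_norm(phi, A), gauge_norm(phi, UAV),
          gauge_norm(phi, add(A, B)) <= gauge_norm(phi, A) + gauge_norm(phi, B))
```

For this `A` the singular values are $3\sqrt{5}$ and $\sqrt{5}$; the three gauge functions give the spectral, trace, and Frobenius norms, and each value is unchanged (up to rounding) after multiplying by the rotations.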


Theorem 10.37 If $\|\cdot\|$ is a unitarily invariant (matrix-vector) norm
on $M_{m\times n}$, then there exists a symmetric gauge function $\phi$ on $\mathbb{R}^n$ such
that $\|A\| = \phi(\sigma(A))$ for all $m \times n$ complex matrices $A$.


Proof. If $m \ge n$, for $x = (x_1, \ldots, x_n) \in \mathbb{R}^n$, define an $m \times n$ matrix
$M_x$ whose $(i,i)$-entry is $x_i$, $i = 1, 2, \ldots, n$, and 0 elsewhere. Let
$\phi(x) = \|M_x\|$. Then $\phi$ is a function from $\mathbb{R}^n$ to $\mathbb{R}$, and it is readily
seen that it satisfies (a) and (b). For (d) and (e), it suffices to note
that $\|PM_xQ\| = \|M_x\|$ for permutation matrices $P$ and $Q$ and for
diagonal matrices $Q$ with $\pm 1$ on the main diagonal. For (c), we have

$$\begin{aligned}
\phi(x + y) = \|M_{x+y}\| &= \|M_x + M_y\| \\
&\le \|M_x\| + \|M_y\| = \phi(x) + \phi(y).
\end{aligned}$$

Thus, $\phi: \mathbb{R}^n \to \mathbb{R}_+$ is a symmetric gauge function. For any $m \times n$
matrix $A$, $\sigma(A) = \lambda^{1/2}(A^*A) \in \mathbb{R}^n_+$. Let $A = UA_0V$ be a singular
value decomposition of $A$, where $A_0 = M_{\sigma(A)}$. Then


$$\|A\| = \|A_0\| = \|M_{\sigma(A)}\| = \phi(\sigma(A)).$$

If $m < n$, we define $\|X\|' = \|X^T\|$, where $X$ is $n \times m$. Then $\|\cdot\|'$
is a unitarily invariant norm on $M_{n\times m}$. The above argument ensures
a symmetric gauge function $\psi: \mathbb{R}^m \to \mathbb{R}_+$ such that $\psi(\sigma(X)) =
\|X\|' = \|X^T\|$. Define $\phi: \mathbb{R}^n \to \mathbb{R}_+$ by $\phi(x) = \psi(\tilde{x})$, where $\tilde{x}$ is the
$m$-vector consisting of the first $m$ largest components of $x$ in absolute
value. Then $\phi$ is a symmetric gauge function on $\mathbb{R}^n$ (Problem 4).

Now for $A \in M_{m\times n}$, $A^T$ is $n \times m$, $\sigma(A)$ is an $n$-vector, and $\widetilde{\sigma(A)}$
is an $m$-vector that differs from $\sigma(A)$ by $n - m$ zeros. Notice that
$A^*A$, $AA^*$, and $\bar{A}A^T = (A^T)^*A^T$ have the same nonzero eigenvalues.
Thus $\widetilde{\sigma(A)} = \sigma(A^T)$. It follows that

$$\phi(\sigma(A)) = \psi\big(\widetilde{\sigma(A)}\big) = \psi(\sigma(A^T)) = \|A^T\|' = \|A\|. \qquad ∎$$
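
The passage from a gauge function on $\mathbb{R}^m$ to one on $\mathbb{R}^n$ used in the case $m < n$ is easy to express in code. The following Python sketch is an added illustration; the names `top_m_by_abs` and `lift` are hypothetical, and the example takes the sum of absolute values on $\mathbb{R}^2$ as the starting gauge function.

```python
def top_m_by_abs(x, m):
    """The m components of x that are largest in absolute value."""
    return sorted(x, key=abs, reverse=True)[:m]

def lift(psi, m):
    """Given a gauge function psi on R^m, return phi on R^n (n >= m)
    defined by phi(x) = psi(m largest components of x in absolute value)."""
    return lambda x: psi(top_m_by_abs(x, m))

def sum_norm(v):
    return sum(abs(t) for t in v)

phi = lift(sum_norm, 2)

x = [1.0, -4.0, 2.0, 3.0]
print(top_m_by_abs(x, 2), phi(x))

# A vector that ends in n - m zeros, like sigma(A) when m < n:
# phi simply evaluates psi on the leading m components.
s = [5.0, 2.0, 0.0, 0.0]
print(phi(s), sum_norm(s[:2]))
```

With this choice of starting function, `phi` is the Ky Fan 2-norm on $\mathbb{R}^4$; reordering the components of `x` or flipping their signs does not change `phi(x)`.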

Theorem 10.38 (von Neumann) Let $A, B \in M_{m\times n}$. Then

$$\sigma(A) \prec_w \sigma(B) \iff \|A\| \le \|B\|$$

for all unitarily invariant matrix-vector norms $\|\cdot\|$ on $M_{m\times n}$.


Proof. Theorem 10.37 says that unitarily invariant norms are essentially
the same as symmetric gauge functions. On the other
hand, Theorem 10.35 ensures that $\sigma(A) \prec_w \sigma(B)$ if and only if
$\phi(\sigma(A)) \le \phi(\sigma(B))$ for all symmetric gauge functions $\phi$. ∎


Theorem 10.39 (Fan Dominance Theorem) Let $A, B \in M_{m\times n}$.
Then $\|A\| \le \|B\|$ for all unitarily invariant matrix-vector norms
$\|\cdot\|$ on $M_{m\times n}$ if and only if $\|A\|_{(k)} \le \|B\|_{(k)}$, $k = 1, 2, \ldots, q =
\min\{m, n\}$, where $\|A\|_{(k)}$ denote the Ky Fan $k$-norms on $M_{m\times n}$:

$$\|A\|_{(k)} = \sum_{i=1}^k \sigma_i(A), \quad k = 1, 2, \ldots, q.$$


Proof. Ky Fan $k$-norms on $M_{m\times n}$ are unitarily invariant; so necessity
is obvious. Conversely, that $\|A\|_{(k)} \le \|B\|_{(k)}$ for each $k$ is the same
as $\sigma(A) \prec_w \sigma(B)$. By Theorem 10.38, the sufficiency follows. ∎
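
To see the Fan dominance theorem at work, the sketch below compares Ky Fan $k$-norms and Schatten $p$-norms of real $2 \times 2$ matrices. It is an added illustration with assumptions stated plainly: singular values are computed in closed form for real $2 \times 2$ matrices only, the matrices are arbitrary examples, and only a few values of $p$ are tried.

```python
import math

def singular_values_2x2(a):
    """Singular values s1 >= s2 >= 0 of a real 2x2 matrix [[p, q], [r, s]]."""
    (p, q), (r, s) = a
    fro2 = p * p + q * q + r * r + s * s      # s1^2 + s2^2
    det = abs(p * s - q * r)                  # s1 * s2
    disc = math.sqrt(max(fro2 * fro2 - 4.0 * det * det, 0.0))
    return (math.sqrt((fro2 + disc) / 2.0),
            math.sqrt(max((fro2 - disc) / 2.0, 0.0)))

def ky_fan_norms(a):
    """The Ky Fan k-norms for k = 1, 2."""
    s1, s2 = singular_values_2x2(a)
    return [s1, s1 + s2]

def schatten_norm(a, p):
    """(s1^p + s2^p)^(1/p), p >= 1."""
    s1, s2 = singular_values_2x2(a)
    return (s1 ** p + s2 ** p) ** (1.0 / p)

def fan_dominated(a, b, tol=1e-12):
    """True if every Ky Fan k-norm of a is at most that of b."""
    return all(ka <= kb + tol for ka, kb in zip(ky_fan_norms(a), ky_fan_norms(b)))

A = [[1.0, 1.0], [0.0, 1.0]]
B = [[3.0, 0.0], [4.0, 5.0]]
print(fan_dominated(A, B))
print([schatten_norm(A, p) <= schatten_norm(B, p) for p in (1, 2, 3, 7)])

# One norm inequality alone is not enough: C has the smaller trace norm
# but the larger spectral norm, so the Ky Fan test with k = 1 fails.
C = [[2.0, 0.0], [0.0, 0.0]]
D = [[1.5, 0.0], [0.0, 1.5]]
print(fan_dominated(C, D), ky_fan_norms(C), ky_fan_norms(D))
```

Since `A` is dominated by `B` in both Ky Fan norms, every Schatten norm tried obeys the same ordering, as the theorem predicts; for `C` and `D` the dominance fails at $k = 1$.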

A variety of matrix inequalities on unitarily invariant norms follows
at once from Theorem 10.38. For instance, by Theorem 10.25,
for any unitarily invariant norm $\|\cdot\|$ on $M_n$, positive semidefinite $A, B \in M_n$, and $z \in \mathbb{C}$,

$$\|A - |z|B\| \le \|A + zB\| \le \|A + |z|B\|,$$

and Corollary 10.8 ensures that for a unitarily invariant norm $\|\cdot\|$
on $M_n$, positive semidefinite $A, B \in M_n$, and positive integers $m$,

$$\|(AB)^m\| \le \|A^mB^m\|.$$


Problems


1. If $\|\cdot\|$ is a vector norm on $\mathbb{R}^n$ satisfying $\|xP\| = \|x\|$ for any $n \times n$
permutation matrix $P$, show that such a vector norm is Schur-convex.


2. Show that a symmetric gauge function on $\mathbb{R}^n$ is a convex function.

3. Let $\|\cdot\|$ be a vector norm on $\mathbb{R}^n$. Show that $\|\,|x|\,\| = \|x\|$ for any $x \in
\mathbb{R}^n$ if and only if $0 \le x \le y \Rightarrow \|x\| \le \|y\|$, where $|x| = (|x_1|, \ldots, |x_n|)$.

4. Let $\psi$ be a symmetric gauge function on $\mathbb{R}^m$. Let $n > m$. For $x \in \mathbb{R}^n$,
let $\tilde{x}$ be the vector of the $m$ largest components of $x$ in absolute value.
Show that $\phi(x) = \psi(\tilde{x})$ is a symmetric gauge function on $\mathbb{R}^n$.

5. Show that the $l_p$-norm $\|\cdot\|_p$ (or the Schatten $p$-norm) and the Ky
Fan $k$-norm $\|\cdot\|_{(k)}$ defined on $\mathbb{R}^n$ are symmetric gauge functions:

$$\|x\|_p = \Big(\sum_{i=1}^n |x_i|^p\Big)^{1/p}, \qquad \|x\|_{(k)} = \max_{i_1 < \cdots < i_k} \sum_{t=1}^k |x_{i_t}|.$$

6. For an $m \times n$ matrix $A$, let $\|A\|_{(k)}$ be given as in Theorem 10.39; for
$x \in \mathbb{R}^n$, let $\|x\|_{(k)}$ be defined as in the previous problem. Show that

$$\|A\|_{(k)} = \|\sigma(A)\|_{(k)}.$$

7. Extend the definition of symmetric gauge function for $\mathbb{C}^n$ as follows.
Replace the conditions (b) and (e) by, respectively, (b'). $\phi(cx) =
|c|\,\phi(x)$, $c \in \mathbb{C}$ and (e'). $\phi(x) = \phi(|x|)$, where $|x| = (|x_1|, \ldots, |x_n|)$.
For $x, y \in \mathbb{C}^n$, show that $|x| \prec_w |y|$ if and only if $\phi(x) \le \phi(y)$ for all
symmetric gauge functions $\phi$ on $\mathbb{C}^n$.

8. We have used $\|\cdot\|_{(k)}$ to denote the Ky Fan $k$-norms for both matrices
(in Theorem 10.39) and vectors (in Problem 5). If $M$ is a matrix
with singular value vector $v = \sigma(M)$, show that $\|M\|_{(k)} = \|v\|_{(k)}$.
Let A = G ae Compute $\|A\|_{(1)}$. If $A$ is considered as a vector in
$\mathbb{R}^4$, say $u$, find $\|u\|_{(1)}$. Are $\|A\|_{(1)}$ and $\|u\|_{(1)}$ the same?


9. Show that $\|A\| = \|\,|A|\,\|$ for any unitarily invariant norm $\|\cdot\|$ on $M_n$
and $A \in M_n$, where $|A| = (A^*A)^{1/2}$.

10. Show that $\|\cdot\|$ defined below is a unitarily invariant norm on $M_n$:

$$\|A\| = \max\{|\operatorname{tr}(AX)| : X \in M_n,\ \operatorname{tr}(X^*X) = 1\}.$$

11. Let $\alpha_1 \ge \alpha_2 \ge \cdots \ge \alpha_n > 0$ be given. Define

$$\|A\|_\alpha = \sum_{i=1}^n \alpha_i\sigma_i(A), \quad A \in M_n.$$

Show that $\|\cdot\|_\alpha$ is a unitarily invariant norm on $M_n$.


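12. Let $A \in M_n$ and let $\lambda_1, \lambda_2, \ldots, \lambda_n$ be the eigenvalues of $A$. Denote
by $\Lambda$ the diagonal matrix $\operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$. Show that for every
unitarily invariant norm on $M_n$, $\|\Lambda\| \le \|A\|$.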


13. Let $A$ and $B$ be $m \times n$ complex matrices with singular values $\sigma_1(A) \ge
\cdots \ge \sigma_n(A)$ and $\sigma_1(B) \ge \cdots \ge \sigma_n(B)$, respectively. Show that

$$\|\operatorname{diag}(\sigma_1(A) - \sigma_1(B), \ldots, \sigma_n(A) - \sigma_n(B))\| \le \|A - B\|$$

for unitarily invariant norms on $M_{m\times n}$. [Hint: Use Theorem 10.24.]

14. Let $A, B \in M_n$. Show that for every unitarily invariant norm on $M_n$,

$$2\|AB\| \le \|A^*A\| + \|B^*B\|.$$

15. Let $A, B \in M_n$. As is known, $|A + B| \le |A| + |B|$ is not true in
general, where $|X| = (X^*X)^{1/2}$ (see Section 8.6). However, show by
Theorem 8.22 that for any unitarily invariant norm $\|\cdot\|$ on $M_n$,

$$\|\,|A + B|\,\| \le \|\,|A|\,\| + \|\,|B|\,\|.$$


16. Let $A, B \in M_n$. Show that for unitarily invariant norms $\|\cdot\|$ on $M_{2n}$,
A+B A+B 
| oA | < ae at= tale iaii < M04l +121) 00h 


17. Let $A = U\begin{pmatrix} D & 0 \\ 0 & 0 \end{pmatrix}V$ be a singular value decomposition of $A$, where $U$
and $V$ are $m \times m$ and $n \times n$ unitary matrices, respectively, $D$ is an $r \times r$
positive diagonal matrix, and $r = \operatorname{rank}(A)$. Let $A^\dagger$ be the Moore-Penrose
inverse of $A$, i.e., $A^\dagger = V^*\begin{pmatrix} D^{-1} & 0 \\ 0 & 0 \end{pmatrix}U^*$. Let $A, B \in M_{m\times n}$.
Show that for any $n \times n$ matrix $X$ and unitarily invariant norm $\|\cdot\|$,

$$\|A(A^\dagger B) - B\| \le \|AX - B\|.$$

18. Let $A$ be an $m \times n$ complex matrix. Show that the Moore-Penrose
inverse $A^\dagger$ of $A$ is the only matrix satisfying all the equations

$$AXA = A, \quad XAX = X, \quad (AX)^* = AX, \quad (XA)^* = XA.$$


19. Give a unitarily invariant matrix-vector norm $\|\cdot\|$ on $M_n$ for which
$\|AB\| \le \|A\|\,\|B\|$ does not hold for some matrices $A, B \in M_n$.

20. Let $A \in M_n$ and $\|\cdot\|$ denote a unitarily invariant norm on $M_n$. Prove

(a) For any Hermitian $X \in M_n$, $\|A - \frac{1}{2}(A + A^*)\| \le \|A - X\|$.

(b) For any skew-Hermitian $X \in M_n$, $\|A - \frac{1}{2}(A - A^*)\| \le \|A - X\|$.

(c) If $A = UP$ is a polar decomposition of $A$, where $U$ is unitary
and $P$ is positive semidefinite, then for any unitary $X \in M_n$,
$\|A - U\| \le \|A - X\| \le \|A + U\|$.

(d) If $A = UDV$ is a singular value decomposition of $A$, where $U$
and $V$ are unitary, and $D$ is nonnegative diagonal, then for any
unitary $X \in M_n$, $\|A - UV\| \le \|A - X\| \le \|A + UV\|$.

(e) If $A$ is normal, then $\|A\| \le \|X^{-1}AX\|$ for all invertible $X \in M_n$.

(f) If $A$ is positive semidefinite, then $\|A - I\| \le \|A - X\| \le \|A + I\|$
for any unitary $X \in M_n$.

21. Let $A$ and $B$ be $n \times n$ normal matrices. Let $A \circ B$ be the Hadamard
(entrywise) product of $A$ and $B$ and $|A| = (A^*A)^{1/2}$. Show that

$$\|A \circ B\| \le \|\,|A| \circ |B|\,\|$$


for unitarily invariant norms on M,,. [Hint: Note that ¢. re > 0.) 


22. Let $A, B \in M_n$. If the product $AB$ is normal, show that $\|AB\| \le
\|BA\|$ for any unitarily invariant matrix-vector norm $\|\cdot\|$.


23. Let $p$ and $q$ be positive real numbers such that $\frac{1}{p} + \frac{1}{q} = 1$. Show that
|| ABP? = | ABP? < || [API BI 
for all A, B € M,, and for all unitarily invariant norms || - || on M,,. 

24. (Horn-Johnson) Let $\|\cdot\|$ be a unitarily invariant matrix-vector
norm on $M_n$. Show that the following statements are equivalent.
(a) $\|\operatorname{diag}(1, 0, \ldots, 0)\| \ge 1$.
(b) $\|A\| \ge \sigma_{\max}(A)$ for all matrices $A \in M_n$.
(c) $\|A\| \ge \rho(A)$, the spectral radius of $A$, for all matrices $A \in M_n$.
(d) $\|AB\| \le \|A\|\,\|B\|$ for all matrices $A, B \in M_n$.
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Rec, 129, 195 
Imc, 294 

w, 139 
$333 

di3, 266 
VoaW,4 
VUW, 26, 68 
P>=Q,4 
PS Q, 32 
Span S, 3 
dim V, 3 
V+tW,4 


Sei, B8198 
d(x,y), 182 
|x|, 327 

I|a'||, 28 


\|z\|p, 373 


\|z||(xy, 373 
at, 2 
a+, 28 


n X n (i.e., n-Square) complex matrices 

m X n complex matrices 

complex numbers 

real numbers 

a field of numbers, i.e., C or R in this book 

rational numbers 

(column) vectors with n complex components 
(column) vectors with n real components 

(column) vectors with n nonnegative components 
polynomials over field F 

polynomials over field F with degree at most n 
real-valued continuous functions on interval $[a, b]$
real-valued functions with continuous derivatives on $\mathbb{R}$
real part of complex number $c$

imaginary part of complex number $c$

$n$th primitive root of unity

$t^+ = t$ if $t \ge 0$; $t^+ = 0$ if $t < 0$

Kronecker delta, i.e., $\delta_{ij} = 1$ if $i = j$, and 0 otherwise
intersection of sets $V$ and $W$

union of sets V and W 

statement P implies statement Q 

statements P and Q are equivalent 

vector space spanned by the vectors in S 

dimension of the vector space V 

sum of subspaces V and W 

direct sum of subspaces V and W 

differential operator 

nth symmetric group, i.e., all permutations on {1,2,...,n} 
vector with ith component 1 and 0 elsewhere 

inner product of vectors $u$ and $v$, i.e., $v^*u$

angle between real vectors $x$, $y$, i.e., $\angle_{x,y} = \cos^{-1}\dfrac{\langle x, y\rangle}{\|x\|\,\|y\|}$

distance between $x$ and $y$ in a metric space
absolute value vector $|x| = (|x_1|, \ldots, |x_n|)$

length or norm of vector $x$

$l_p$-norm of vector $x$, i.e., $\|x\|_p = \big(\sum_{i=1}^n |x_i|^p\big)^{1/p}$

Ky Fan $k$-norm of vector $x$, i.e., $\|x\|_{(k)} = \max_{i_1 < \cdots < i_k} \sum_{t=1}^k |x_{i_t}|$


transpose of x; it is a column vector if x is a row vector 
vectors orthogonal to vector x 
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S*, 28 

Si 152, 28 
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In, I, 9 

A = (aiz), 8 
A™,9 

A, 9 

A*,9 
A-*,.13 
At, 377 
Ai, 41, 217 
A(i\j), 18 
adj(A), 13 
det A, 12 
rank (A), 11 
tr A, 21 


Im A, 17, 51 
Ker A, 17, 51 
R(A), 55 
C(A), 55 
R(A, 306 
C(A), 306 
H(A), 233 
S(A), 361 

W (A), 107 
Jn, 152 

Tn, 133 

A, 150 

Vn (ai), 143 
G(ai), 225 
sn(a;), 124 
w(A), 109 
p(A), 109 
i4(A), 255 
i_(A), 255 
io(A), 255 
In(A), 256 

Aecascl A), 124 
Omax(A), 109 
o1(A), 266 
Amin(A), 266 
Omin(A), 266 
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Notation 


vector space orthogonal to set S 

$\langle x, y\rangle = 0$ for all $x \in S_1$ and $y \in S_2$

eigenspace of the eigenvalue \ 

identity matrix of order n 

matrix with entries aj; 

transpose of matrix A 

conjugate of matrix A 

conjugate transpose of matrix A 

inverse of matrix A 

Moore—Penrose inverse of matrix A 

principal submatrix of matrix A in the upper-left corner 
matrix by deleting the ith row and jth column of matrix A 
adjoint matrix of matrix A 

determinant of matrix A 

rank of matrix A 

trace of matrix A 

diagonal matrix with the elements of S on the diagonal 
determinant of the 2 x 2 block matrix 

matrix inner product, i.e., $\langle A, B\rangle_M = \operatorname{tr}(B^*A)$

image of matrix or linear transformation A, i.e., Im A = {Ax} 
kernel or null space of A, i.e., Ker A = {x : Ax = 0} 

row space spanned by the row vectors of matrix A 
column space spanned by the column vectors of matrix A 
row sum vector of matrix A 

column sum vector of matrix A 

Hermitian part of matrix $A$, i.e., $\frac{1}{2}(A + A^*)$
skew-Hermitian part of matrix $A$, i.e., $\frac{1}{2}(A - A^*)$
numerical range of matrix A 

n-square matrix with all entries equal to 1 

n-square tridiagonal matrix 

Hadamard matrix 

n-square Vandermonde matrix of a1,...,@n 

Gram matrix of 71,...,%n 

kth elementary symmetric function of a1,..., an 
numerical radius of matrix A 

spectral radius of matrix A 

number of positive eigenvalues of Hermitian matrix A 
number of negative eigenvalues of Hermitian matrix A 
number of zero eigenvalues of Hermitian matrix A 

inertia of Hermitian matrix $A$, i.e., $\operatorname{In}(A) = (i_+(A), i_-(A), i_0(A))$
largest eigenvalue of matrix A 

largest singular value of matrix A, i.e., the spectral norm of A 
largest singular value of matrix A; the same as Omax(A) 
smallest eigenvalue of matrix A 

smallest singular value of matrix A 


$\lambda_i(A)$, 21, 82: eigenvalue of matrix $A$
$\sigma_i(A)$, 61, 82: singular value of matrix $A$
$\lambda(A)$, 349: eigenvalue vector of $A \in M_n$, i.e., $\lambda(A) = (\lambda_1(A), \ldots, \lambda_n(A))$
$\sigma(A)$, 349: singular value vector of $A \in M_{m\times n}$, i.e., $\sigma(A) = (\sigma_1(A), \ldots, \sigma_n(A))$
$\lambda^\alpha(A)$, 365: $\lambda^\alpha(A) = (\lambda_1^\alpha(A), \ldots, \lambda_n^\alpha(A)) = ((\lambda_1(A))^\alpha, \ldots, (\lambda_n(A))^\alpha)$
$\sigma^\alpha(A)$, 365: $\sigma^\alpha(A) = (\sigma_1^\alpha(A), \ldots, \sigma_n^\alpha(A)) = ((\sigma_1(A))^\alpha, \ldots, (\sigma_n(A))^\alpha)$
$p(\lambda)\,|\,q(\lambda)$, 94: $p(\lambda)$ divides $q(\lambda)$
$d(\lambda)$, 94: invariant factors of $\lambda$-matrix $\lambda I - A$
$d(A)$, 349: vector of diagonal entries of a square matrix $A$
$m_A(\lambda)$, 88: minimal polynomial of matrix $A$
$p_A(\lambda)$, 21, 87: characteristic polynomial of matrix $A$, i.e., $p_A(\lambda) = \det(\lambda I - A)$
$A \ge 0$, 81: $A$ is positive semidefinite (or a nonnegative matrix in Section 5.7)
$A > 0$, 81: $A$ is positive definite (or a positive matrix in Section 5.7)
$A \ge B$, 81: $A - B$ is positive semidefinite (or $a_{ij} \ge b_{ij}$ in Section 5.7)
$A^{1/2}$, 81, 203: square root of positive semidefinite matrix $A$
$A^\alpha$, 81: $A^\alpha = U^*\operatorname{diag}(\lambda_1^\alpha, \ldots, \lambda_n^\alpha)U$ if $A = U^*\operatorname{diag}(\lambda_1, \ldots, \lambda_n)U$
$e^A$, 66: matrix exponential of $A$, i.e., $e^A = \sum_{k=0}^\infty \frac{1}{k!}A^k$
$A[\alpha]$, 219: principal submatrix of $A$
$A[\alpha|\beta]$, 122: submatrix of $A$ indexed by $\alpha$ and $\beta$
$|A|$, 83, 287: $|A| = (A^*A)^{1/2}$ (or $(|a_{ij}|)$ in Section 5.7)
$\widetilde{A_{11}}$, 217: Schur complement of $A_{11}$
$A^{(k)}$, 199: $k$th compound matrix of matrix $A$
$\|A\|$, 113: norm of matrix $A$
$\|A\|_{op}$, 113: operator norm of matrix $A$, i.e., $\|A\|_{op} = \sup_{\|x\|=1} \|Ax\|$
$\|A\|_F$, 115: Frobenius norm of matrix $A$, i.e., $\|A\|_F = \big(\sum_i \sigma_i^2(A)\big)^{1/2}$
$\|A\|_{(k)}$, 115: Ky Fan $k$-norm of matrix $A$, i.e., $\|A\|_{(k)} = \sum_{i=1}^k \sigma_i(A)$
$\|A\|_p$, 115: Schatten $p$-norm of matrix $A$, i.e., $\|A\|_p = \big(\sum_i \sigma_i^p(A)\big)^{1/p}$
$[A, B]$, 305: commutator of $A$ and $B$, i.e., $[A, B] = AB - BA$
$A \oplus B$, 11: direct sum of matrices $A$ and $B$, i.e., $A \oplus B = \begin{pmatrix} A & 0 \\ 0 & B \end{pmatrix}$
$A \otimes B$, 117: Kronecker product of matrices $A$ and $B$
$A \circ B$, 117: Hadamard product of matrices $A$ and $B$
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